To Measure the Sky

An Introduction to
Observational Astronomy

Frederick R. Chromey







To Measure the Sky 

An Introduction to Observational Astronomy 

With a lively yet rigorous and quantitative approach, Frederick R. Chromey 
introduces the fundamental topics in optical observational astronomy for 
undergraduates. 

Focussing on the basic principles of light detection, telescope optics, coordinate 
systems and data analysis, the text introduces students to modern observing 
techniques and measurements. It approaches cutting-edge technologies such as 
advanced CCD detectors, integral field spectrometers, and adaptive optics 
through the physical principles on which they are based, helping students to 
understand the power of modern space and ground-based telescopes, and the 
motivations for and limitations of future development. Discussion of statistics 
and measurement uncertainty enables students to confront the important 
questions of data quality. 

It explains the theoretical foundations for observational practices and reviews 
essential physics to support students’ mastery of the subject. Subject 
understanding is strengthened through over 120 exercises and problems. 
Chromey’s purposeful structure and clear approach make this an essential 
resource for all students of observational astronomy. 

Frederick R. Chromey is Professor of Astronomy on the Matthew Vassar 
Junior Chair at Vassar College, and Director of the Vassar College Observatory. 
He has almost 40 years’ experience in observational astronomy research in the 
optical, radio, and near infrared on stars, gaseous nebulae, and galaxies, and has 
taught astronomy to undergraduates for 35 years at Brooklyn College and 
Vassar. 



To Measure the Sky 

An Introduction to 
Observational Astronomy 



Frederick R. Chromey 

Vassar College 




Cambridge 

UNIVERSITY PRESS 



CAMBRIDGE UNIVERSITY PRESS 

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, 

São Paulo, Delhi, Dubai, Tokyo 

Cambridge University Press 

The Edinburgh Building, Cambridge CB2 8RU, UK 

Published in the United States of America by Cambridge University Press, New York 
www.cambridge.org 

Information on this title: www.cambridge.org/9780521763868 
© F. Chromey 2010 



This publication is in copyright. Subject to statutory exception and to the 
provision of relevant collective licensing agreements, no reproduction of any part 
may take place without the written permission of Cambridge University Press. 

First published in print format 2010 



ISBN-13 978-0-511-72954-6 eBook (NetLibrary) 

ISBN-13 978-0-521-76386-8 Hardback 

ISBN-13 978-0-521-74768-4 Paperback 

Cambridge University Press has no responsibility for the persistence or accuracy 
of urls for external or third-party internet websites referred to in this publication, 
and does not guarantee that any content on such websites is, or will remain, 
accurate or appropriate. 




To Molly 



Contents 



Preface 

1 Light 
1.1 The story 
1.2 The evidence: astronomical data 
1.3 Models for the behavior of light 
1.4 Measurements of light rays 
1.5 Spectra 
1.6 Magnitudes 
Summary 
Exercises 

2 Uncertainty 
2.1 Accuracy and precision 
2.2 Populations and samples 
2.3 Probability distributions 
2.4 Estimating uncertainty 
2.5 Propagation of uncertainty 
2.6 Additional topics 
Summary 
Exercises 

3 Place, time, and motion 
3.1 Astronomical coordinate systems 
3.2 The third dimension 
3.3 Time 
3.4 Motion 
Summary 
Exercises 

4 Names, catalogs, and databases 
4.1 Star names 
4.2 Names and catalogs of non-stellar objects outside the Solar System 
4.3 Objects at non-optical wavelengths 
4.4 Atlases and finding charts 
4.5 Websites and other computer resources 
4.6 Solar System objects 
Summary 
Exercises 

5 Optics for astronomy 
5.1 Principles of geometric optics 
5.2 Lenses, mirrors, and simple optical elements 
5.3 Simple telescopes 
5.4 Image quality: telescopic resolution 
5.5 Aberrations 
Summary 
Exercises 

6 Astronomical telescopes 
6.1 Telescope mounts and drives 
6.2 Reflecting telescope optics 
6.3 Telescopes in space 
6.4 Ground-based telescopes 
6.5 Adaptive optics 
6.6 The next stage: ELTs and advanced AO 
Summary 
Exercises 

7 Matter and light 
7.1 Isolated atoms 
7.2 Isolated molecules 
7.3 Solid-state crystals 
7.4 Photoconductors 
7.5 The MOS capacitor 
7.6 The p-n junction 
7.7 The vacuum photoelectric effect 
7.8 Superconductivity 
Summary 
Exercises 

8 Detectors 
8.1 Detector characterization 
8.2 The CCD 
8.3 Photo-emissive devices 
8.4 Infrared arrays 
8.5 Thermal detectors 
Summary 
Exercises 

9 Digital images from arrays 
9.1 Arrays 
9.2 Digital image manipulation 
9.3 Preprocessing array data: bias, linearity, dark, flat, and fringe 
9.4 Combining images 
9.5 Digital aperture photometry 
Summary 
Exercises 

10 Photometry 
10.1 Introduction: a short history 
10.2 The response function 
10.3 The idea of a photometric system 
10.4 Common photometric systems 
10.5 From source to telescope 
10.6 The atmosphere 
10.7 Transformation to a standard system 
Summary 
Exercises 

11 Spectrometers 
11.1 Dispersive spectrometry 
11.2 Dispersing optical elements 
11.3 Spectrometers without slits 
11.4 Basic slit and fiber spectrometers 
11.5 Spectrometer design for astronomy 
11.6 Spectrometric data 
11.7 Interpreting spectra 
Summary 
Exercises 

Appendices 

References 

Index 



Preface 



There is an old joke: a lawyer, a priest, and an observational astronomer walk 
into a bar. The bartender turns out to be a visiting extraterrestrial who presents 
the trio with a complicated-looking black box. The alien first demonstrates that 
when a bucketful of garbage is fed into the entrance chute of the box, a small 
bag of high-quality diamonds and a gallon of pure water appear at its output. 
Then, assuring the three that the machine is his gift to them, the bartender 
vanishes. 

The lawyer says, “Boys, we’re rich! It’s the goose that lays the golden egg! 
We need to form a limited partnership so we can keep this thing secret and share 
the profits.” 

The priest says, “No, no, my brothers, we need to take this to the United 
Nations, so it can benefit all humanity.” 

“We can decide all that later,” the observational astronomer says. “Get me a 
screwdriver. I need to take this thing apart and see how it works.” 

This text grew out of 16 years of teaching observational astronomy to under- 
graduates, where my intent has been partly to satisfy — but mainly to cultivate — 
my students’ need to look inside black boxes. The text introduces the primary 
tools for making astronomical observations at visible and infrared wavelengths: 
telescopes, detectors, cameras, and spectrometers, as well as the methods for 
securing and understanding the quantitative measurements they make. I hope 
that after this introductory text, none of these tools will remain a completely 
black box, and that the reader will be ready to use them to pry into other boxes. 

The book, then, aims at an audience similar to my students: nominally second- 
or third-year science majors, but with a sizeable minority containing advanced 
first-year students, non-science students, and adult amateur astronomers. About 
three-quarters of those in my classes are not bound for graduate school in 
astronomy or physics, and the text has that set of backgrounds in mind. 

I assume my students have little or no preparation in astronomy, but do 
presume that each has had one year of college-level physics and an introduction 
to integral and differential calculus. A course in modern physics, although very 
helpful, is not essential. I make the same assumptions about readers of this book. 
Since readers’ mastery of physics varies, I include reviews of the most relevant 
physical concepts: optics, atomic structure, and solid-state physics. I also include 
a brief introduction to elementary statistics. I have written qualitative chapter 
summaries, but the problems posed at the end of each chapter are all quantitative 
exercises meant to strengthen and further develop student understanding. 

My approach is to be rather thorough on fundamental topics in astronomy, in 
the belief that individual instructors will supply enrichment in specialized areas 
as they see fit. I regard as fundamental: 

• the interaction between light and matter at the atomic level, both in the case of the 
formation of the spectrum of an object and in the case of the detection of light by an 
instrument 

• the role of uncertainty in astronomical measurement 

• the measurement of position and change of position 

• the reference to and bibliography of astronomical objects, particularly with modern 
Internet-based systems like SIMBAD and ADS 

• the principles of modern telescope design, including space telescopes, extremely large 
telescopes and adaptive optics systems 

• principles of operation of the charge-coupled device (CCD) and other array detectors; 
photometric and spectroscopic measurements with these arrays 

• treatment of digital array data: preprocessing, calibration, background removal, co- 
addition and signal-to-noise estimation 

• the design of modern spectrometers. 

The text lends itself to either a one- or two-semester course. I personally use 
the book for a two-semester sequence, where, in addition to the entire text and 
its end-of-chapter problems, I incorporate a number of at-the-telescope projects 
both for individuals and for “research teams” of students. I try to vary the large 
team projects: these have included a photometric time series of a variable object 
(in different years an eclipsing exoplanetary system, a Cepheid, and a blazar), an 
H-R diagram, and spectroscopy of the atmosphere of a Jovian planet. I am 
mindful that astronomers who teach with this text will have their own special 
interests in particular objects or techniques, and will have their own limitations 
and capabilities for student access to telescopes and equipment. My very firm 
belief, though, is that this book will be most effective if the instructor can devise 
appropriate exercises that require students to put their hands on actual hardware 
to measure actual photons from the sky. 

To use the text for a one-semester course, the instructor will have to skip 
some topics. Certainly, if students are well prepared in physics and mathe- 
matics, one can dispense with some or all of Chapter 2 (statistics), Chapter 5 
(geometrical optics), and Chapter 7 (atomic and solid-state physics), and pos- 
sibly all detectors (Chapter 8) except the CCD. One would still need to choose 
between a more thorough treatment of photometry (skipping Chapter 11, on 
spectrometers), or the inclusion of spectrometry and exclusion of some photo- 
metric topics (compressing the early sections of both Chapters 9 and 10). 




Compared with other texts, this book has strengths and counterbalancing 
weaknesses. I have taken some care with the physical and mathematical treat- 
ment of basic topics, like detection, uncertainty, optical design, astronomical 
seeing, and space telescopes, but at the cost of a more descriptive or encyclo- 
pedic survey of specialized areas of concern to observers (e.g. little treatment of 
the details of astrometry or of variable star observing). I believe the book is an 
excellent fit for courses in which students will do their own optical/infrared 
observing. Because I confine myself to the optical/infrared, I can develop ideas 
more systematically, beginning with those that arise from fundamental astro- 
nomical questions like position, brightness, and spectrum. But that confinement 
to a narrow range of the electromagnetic spectrum makes the book less suitable 
for a more general survey that includes radio or X-ray techniques. 

The sheer number of people and institutions contributing to the production of 
this book makes an adequate acknowledgment of all those to whom I am 
indebted impossible. Inadequate thanks are better than none at all, and I am 
deeply grateful to all who helped along the way. 

A book requires an audience. The audience I had uppermost in mind was 
filled with those students brave enough to enroll in my Astronomy 240-340 
courses at Vassar College. Over the years, more than a hundred of these students 
have challenged and rewarded me. All made contributions that found their way 
into this text, but especially those who asked the hardest questions: Megan 
Vogelaar Connelly, Liz McGrath, Liz Blanton, Sherri Stephan, David Hasselbacher, 
Trent Adams, Leslie Sherman, Kate Eberwein, Olivia Johnson, Iulia 
Deneva, Laura Ruocco, Ben Knowles, Aaron Warren, Jessica Warren, Gabe 
Lubell, Scott Fleming, Alex Burke, Colin Wilson, Charles Wisotzkey, Peter 
Robinson, Tom Ferguson, David Vollbach, Jenna Lemonias, Max Marcus, 
Rachel Wagner-Kaiser, Tim Taber, Max Fagin, and Claire Webb. 

I owe particular thanks to Jay Pasachoff, without whose constant encourage- 
ment and timely assistance this book would probably not exist. Likewise, Tom 
Balonek, who introduced me to CCD astronomy, has shared ideas, data, students, 
and friendship over many years. I am grateful as well to my astronomical 
colleagues in the Keck Northeast Astronomical Consortium; all provided crucial 
discussions on how to thrive as an astronomer at a small college, and many, like 
Tom and Jay, have read or used portions of the manuscript in their observational 
courses. All parts of the book have benefited from their feedback. I thank every 
Keckie, but especially Frank Winkler, Eric Jensen, Lee Hawkins, Karen Kwit- 
ter, Steve Sousa, Ed Moran, Bill Herbst, Kim McLeod, and Allyson Sheffield. 

Debra Elmegreen, my colleague at Vassar, collaborated with me on multiple 
research projects and on the notable enterprise of building a campus observa- 
tory. Much of our joint experience found its way into this volume. Vassar 
College, financially and communally, has been a superb environment for both 
my teaching and my practice of astronomy, and deserves my gratitude. My 
editors at Cambridge University Press have been uniformly helpful and skilled. 



My family and friends have had to bear some of the burden of this writing. 
Clara Bargellini and Gabriel Camera opened their home to me and my laptop 
during extended visits, and Ann Congelton supplied useful quotations and spir- 
ited discussions. I thank my children, Kate and Anthony, who gently remind me 
that what is best in life is not in a book. 

Finally, I thank my wife, Molly Shanley, for just about everything. 



Chapter 1 

Light 



Always the laws of light are the same, but the modes and degrees of seeing vary. 

— Henry David Thoreau, A Week on the Concord and Merrimack Rivers, 1849 

Astronomy is not for the faint of heart. Almost everything it cares for is inde- 
scribably remote, tantalizingly untouchable, and invisible in the daytime, when 
most sensible people do their work. Nevertheless, many — including you, brave 
reader — have enough curiosity and courage to go about collecting the flimsy 
evidence that reaches us from the universe outside our atmosphere, and to hope 
it may hold a message. 

This chapter introduces you to astronomical evidence. Some evidence is in 
the form of material (like meteorites), but most is in the form of light from 
faraway objects. Accordingly, after a brief consideration of the material evi- 
dence, we will examine three theories for describing the behavior of light: light 
as a wave, light as a quantum entity called a photon, and light as a geometrical 
ray. The ray picture is simplest, and we use it to introduce some basic ideas like 
the apparent brightness of a source and how that varies with distance. Most 
information in astronomy, however, comes from the analysis of how brightness 
changes with wavelength, so we will next introduce the important idea of 
spectroscopy. We end with a discussion of the astronomical magnitude system. 
We begin, however, with a few thoughts on the nature of astronomy as an 
intellectual enterprise. 



1.1 The story 

... as I say, the world itself has changed. . . . For this is the great secret, which 
was known by all educated men in our day: that by what men think, we create the 
world around us, daily new. 

— Marion Zimmer Bradley, The Mists of Avalon, 1982 

Astronomers are storytellers. They spin tales of the universe and of its important 
parts. Sometimes they envision landscapes of another place, like the roiling 
liquid-metal core of the planet Jupiter. Sometimes they describe another time, 
like the era before Earth when dense buds of gas first flowered into stars, and a 
darkening Universe filled with the sudden blooms of galaxies. Often the stories 
solve mysteries or illuminate something commonplace or account for something 
monstrous: How is it that stars shine, age, or explode? Some of the best stories 
tread the same ground as myth: What threw up the mountains of the Moon? How 
did the skin of our Earth come to teem with life? Sometimes there are fantasies: 
What would happen if a comet hit the Earth? Sometimes there are prophecies: 
How will the Universe end? 

Like all stories, creation of astronomical tales demands imagination. Like all 
storytellers, astronomers are restricted in their creations by many conventions of 
language as well as by the characters and plots already in the literature. Astron- 
omers are no less a product of their upbringing, heritage, and society than any 
other craftspeople. Astronomers, however, think their stories are special, that 
they hold a larger dose of “truth” about the universe than any others. Clearly, 
the subject matter of astronomy — the Universe and its important parts — does not 
belong only to astronomers. Many others speak with authority about just these 
things: theologians, philosophers, and poets, for example. Is there some char- 
acteristic of astronomers, besides arrogance, that sets them apart from these 
others? Which story about the origin of the Moon, for example, is the truer: 
the astronomical story about a collision 4500 million years ago between the 
proto-Earth and a somewhat smaller proto-planet, or the mythological story 
about the birth of the Sumerian/Babylonian deity Nanna-Sin (a rather formidable 
fellow who had a beard of lapis-lazuli and rode a winged bull)? 

This question of which is the “truer” story is not an idle one. Over the 
centuries, people have discovered (by being proved wrong) that it is very diffi- 
cult to have a commonsense understanding of what the whole Universe and its 
most important parts are like. Common sense just isn’t up to the task. For that 
reason, as Morgan le Fey tells us in The Mists of Avalon, created stories about 
the Universe themselves actually create the Universe the listener lives in. The 
real Universe (like most scientists, you and I behave as if there is one) is not 
silent, but whispers very softly to the storytellers. Many whispers go unheard, so 
that real Universe is probably very different from the one you read about today 
in any book that claims to tell its story. People, nevertheless, must act. Most 
recognize that the bases for their actions are fallible stories, and they must 
therefore select the most trustworthy stories that they can find. 

Most of you won’t have to be convinced that it is better to talk about colliding 
planets than about Nanna-Sin if your aim is to understand the Moon or perhaps 
plan a visit. Still, it is useful to ask the question: What is it, if anything, that 
makes astronomical stories a more reliable basis for action, and in that sense 
more truthful or factual than any others? Only one thing, I think: discipline. 
Astronomers feel an obligation to tell their story with great care, following a 
rather strict, scientific, discipline. 

Scientists, philosophers, and sociologists have written about what it is that 
makes science different from other human endeavors. There is much discussion 
and disagreement about the necessity of making scientific stories “broad and 
deep and simple”, about the centrality of paradigms, the importance of predic- 
tions, the strength or relevance of motivations, and the inevitability of conformity 
to social norms and professional hierarchies. 

But most of this literature agrees on the perhaps obvious point that a scientist, 
in creating a story (scientists usually call them “theories”) about, say, the Moon, 
must pay a great deal of attention to all the relevant evidence. A scientist, unlike 
a science-fiction writer, may only fashion a theory that cannot be shown to 
violate that evidence. 

This is a book about how to identify and collect relevant evidence in astronomy. 



1.2 The evidence: astronomical data 

[Holmes said] I have no data yet. It is a capital mistake to theorize before one has 
data. Insensibly one begins to twist facts to suit theories, instead of theories to 
suit facts. 

— Arthur Conan Doyle, The Adventures of Sherlock Holmes, 1892 

Facts are not pure and unsullied bits of information; culture also influences what 
we see and how we see it. Theories moreover are not inexorable inductions from 
facts. The most creative theories are often imaginative visions imposed upon 
facts; . . . 

— Stephen Jay Gould, The Mismeasure of Man, 1981 

A few fortunate astronomers investigate cosmic rays or the Solar System. All 
other astronomers must construct stories about objects with which they can 
have no direct contact, things like stars and galaxies that can’t be manipulated, 
isolated, or made the subject of experiment. This sets astronomers apart from 
most other scientists, who can thump on, cut up, and pour chemicals over 
their objects of study. In this sense, astronomy is a lot more like paleontology 
than it is like physics. Trying to tell the story of a galaxy is like trying to 
reconstruct a dinosaur from bits of fossilized bone. We will never have the 
galaxy or dinosaur in our laboratory, and must do guesswork based on flimsy, 
secondhand evidence. To study any astronomical object we depend on interme- 
diaries, entities that travel from the objects to us. There are two categories 
of intermediaries — particles with mass, and those without. First briefly consider 
the massive particles, since detailed discussion of them is beyond the scope of 
this book. 

1.2.1 Particles with mass 

Cosmic rays are microscopic particles that arrive at Earth with extraordinarily 
high energies. Primary cosmic rays are mostly high-speed atomic nuclei, 
mainly hydrogen (84%) and helium (14%). The remainder consists of heavier 
nuclei, electrons, and positrons. Some primary cosmic rays are produced in solar 
flares, but many, including those of highest energies, come from outside the 
Solar System. About 6000 cosmic rays strike each square meter of the Earth’s 
upper atmosphere every second. Since all these particles move at a large fraction 
of the speed of light, they carry a great deal of kinetic energy. A convenient unit 
for measuring particle energies is the electron volt (eV): 

$$1\ \mathrm{eV} = 1.602 \times 10^{-19}\ \text{joules}$$

Primary cosmic rays have energies ranging from $10^{6}$ to $10^{20}$ eV, with relative 
abundance declining with increasing energy. The mean energy is around 
10 GeV = $10^{10}$ eV. At relativistic velocities, the relation between speed, $v$, and 
total energy, $E$, is 

$$E = \frac{mc^{2}}{\sqrt{1 - v^{2}/c^{2}}}$$

Here $m$ is the rest mass of the particle and $c$ is the speed of light. For 
reference, the rest mass of the proton (actually, the product $mc^{2}$) is 0.93 GeV. 
The highest-energy cosmic rays have energies far greater than any attainable in 
laboratory particle accelerators. Although supernova explosions are suspected 
to be the source of some or all of the higher-energy primary cosmic rays, the 
exact mechanism for their production remains mysterious. 
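
The relation between total energy and speed can be inverted to give v/c for a particle of known energy. The following Python sketch is only an illustration under stated assumptions: it uses the rounded values quoted in this section (1.602 × 10^-19 joules per eV and a proton rest energy of 0.93 GeV), it treats the particle as a single proton, and the function names are hypothetical choices made for this example.

```python
import math

# Rounded values quoted in the text; both are assumptions of this sketch.
EV_TO_JOULE = 1.602e-19          # joules per electron volt
PROTON_REST_ENERGY_EV = 0.93e9   # proton rest energy m c^2, in eV


def ev_to_joules(energy_ev):
    """Convert an energy in electron volts to joules."""
    return energy_ev * EV_TO_JOULE


def lorentz_factor(total_energy_ev, rest_energy_ev=PROTON_REST_ENERGY_EV):
    """Return gamma = E / (m c^2); requires E >= m c^2."""
    if total_energy_ev < rest_energy_ev:
        raise ValueError("total energy cannot be less than the rest energy")
    return total_energy_ev / rest_energy_ev


def speed_fraction(total_energy_ev, rest_energy_ev=PROTON_REST_ENERGY_EV):
    """Return v/c by solving E = m c^2 / sqrt(1 - v^2/c^2) for v."""
    gamma = lorentz_factor(total_energy_ev, rest_energy_ev)
    return math.sqrt(1.0 - 1.0 / gamma**2)


if __name__ == "__main__":
    mean_energy_ev = 1.0e10  # the 10 GeV mean energy quoted in the text
    print(f"energy in joules: {ev_to_joules(mean_energy_ev):.3e}")
    print(f"Lorentz factor:   {lorentz_factor(mean_energy_ev):.2f}")
    print(f"v/c:              {speed_fraction(mean_energy_ev):.5f}")
```

For a proton carrying the 10 GeV mean energy, this gives a Lorentz factor near 10.8 and a speed within about half a percent of c, consistent with the statement that these particles move at a large fraction of the speed of light.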

Secondary cosmic rays are particles produced by collisions between the 
primaries and particles in the upper atmosphere, generally more than 50 km 
above the surface. Total energy is conserved in the collision, so the kinetic 
energy of the primary can be converted into the rest-mass of new particles, 
and studies of the secondaries give some information about the primaries. 
Typically, a cosmic-ray collision produces many fragments, including pieces 
of the target nucleus, individual nucleons, and electrons, as well as particles not 
present before the collision: positrons, gamma rays, and a variety of more 
unusual short-lived particles like kaons. In fact, cosmic-ray experiments were 
the first to detect pions, muons, and positrons. 

Detection of both primary and secondary cosmic rays relies on methods 
developed for laboratory particle physics. Detectors include cloud and spark 
chambers, Geiger and scintillation counters, flash tubes, and various solid-state 
devices. Detection of primaries requires placement of a detector above the bulk 
of the Earth’s atmosphere, and only secondary cosmic rays can be studied 
directly from the Earth’s surface. Since a shower of secondary particles gen- 
erally spreads over an area of many square kilometers by the time it reaches sea 
level, cosmic-ray studies often utilize arrays of detectors. Typical arrays consist 
of many tens or hundreds of individual detectors linked to a central coordinating 
computer. Even very dense arrays, however, can only sample a small fraction of 
the total number of secondaries in a shower. 



Neutrinos are particles produced in nuclear reactions involving the weak 
nuclear force. They are believed to have tiny rest masses (the best measurements 
to date are uncertain but suggest something like 0.05 eV). They may very well 
be the most numerous particles in the Universe. Many theories predict intense 
production of neutrinos in the early stages of the Universe, and the nuclear 
reactions believed to power all stars produce a significant amount of energy 
in the form of neutrinos. In addition, on Earth, a flux of high-energy “atmospheric” 
neutrinos is generated in cosmic-ray secondary showers. 

Since neutrinos interact with ordinary matter only through the weak force, 
they can penetrate great distances through dense material. The Earth and the 
Sun, for example, are essentially transparent to them. Neutrinos can nonetheless 
be detected: the trick is to build a detector so massive that a significant number 
of neutrino reactions will occur within it. Further, the detector must also be 
shielded from secondary cosmic rays, which can masquerade as neutrinos. 
About a half-dozen such “neutrino telescopes” have been built underground. 

For example, the Super-Kamiokande instrument is a 50 000-ton tank of water 
located 1 km underground in a zinc mine 125 miles west of Tokyo. The water 
acts as both the target for neutrinos and as the detecting medium for the products 
of the neutrino reactions. Reaction products emit light observed and analyzed by 
photodetectors on the walls of the tank. 

Neutrinos have been detected unambiguously from only two astronomical 
objects: the Sun and a nearby supernova in the Large Magellanic Cloud, SN 
1987A. These are promising results. Observations of solar neutrinos, for exam- 
ple, provide an opportunity to test the details of theories of stellar structure and 
energy production. 

Meteorites are macroscopic samples of solid material derived primarily 
from our Solar System’s asteroid belt, although there are a few objects that 
originate from the surfaces of the Moon and Mars. Since they survive passage 
through the Earth’s atmosphere and collision with its surface, meteorites can be 
subjected to physical and chemical laboratory analysis. Some meteorites have 
remained virtually unchanged since the time of the formation of the Solar 
System, while others have endured various degrees of processing. All, however, 
provide precious clues about the origin, age, and history of the Solar System. For 
example, the age of the Solar System (4.56 Gyr) is computed from radioisotopic 
abundances in meteorites, and the inferred original high abundance of radio- 
active aluminum-26 in the oldest mineral inclusions in some meteorites suggests 
an association between a supernova, which would produce the isotope, and the 
events immediately preceding the formation of our planetary system. 

Exploration of the Solar System by human spacecraft began with landings 
on the Moon in the 1960s and 1970s. Probes have returned samples: Apollo and 
Luna spacecraft brought back several hundred kilograms of rock from the Moon. 
Humans and their mechanical surrogates have examined remote surfaces in situ. 
The many landers on Mars, the Venera craft on Venus, and the Huygens lander 
on Titan, for example, made intrusive measurements and conducted controlled 
experiments. 

1.2.2 Massless particles 

Gravitons, theorized particles corresponding to gravity waves, have only been 
detected indirectly through the behavior of binary neutron stars. Graviton detec- 
tors designed to sense the local distortion of space-time caused by a passing 
gravity wave have been constructed, but have not yet detected waves from an 
astronomical source. 

Photons are particles of light that can interact with all astronomical 
objects.¹ Light, in the form of visible rays as well as invisible rays like radio 
and X-rays, has historically constituted the most important channel of astro- 
nomical information. This book is about using that channel to investigate the 
Universe. 



1.3 Models for the behavior of light 

Some (not astronomers!) regard astronomy as applied physics. There is some 
justification for this, since astronomers, to help tell some astronomical story, 
persistently drag out theories proposed by physicists. Physics and astronomy 
differ partly because astronomers are interested in telling the story of an 
object, whereas physicists are interested in uncovering the most fundamental 
rules of the material world. Astronomers tend to find physics useful but sterile; 
physicists tend to find astronomy messy and mired in detail. We now invoke 
physics, to ponder the question: how does light behave? More specifically, 
what properties of light are important in making meaningful astronomical 
observations and predictions? 

1.3.1 Electromagnetic waves 

... we may be allowed to infer, that homogeneous light, at certain equal 
distances in the direction of its motion, is possessed of opposite qualities, 
capable of neutralizing or destroying each other, and extinguishing the light, 
where they happen to be united; . . . 

— Thomas Young, Philosophical Transactions, The Bakerian Lecture, 1804 



1 Maybe not. There is strong evidence for the existence of very large quantities of “dark” matter in 
the Universe. This matter seems to exert gravitational force, but is the source of no detectable 
light. It is unclear whether the dark matter is normal stuff that is well hidden, or unusual stuff that 
can’t give off or absorb light. Even more striking is the evidence for the presence of “dark energy” 
— a pressure-like effect in space itself which contains energy whose mass equivalent is even 
greater than that of visible and dark matter combined. 






Electromagnetic waves are a model for the behavior of light which we 
know to be incorrect ( incomplete is perhaps a better term). Nevertheless, the 
wave theory of light describes much of its behavior with precision, and introduces 
a lot of vocabulary that you should master. Christian Huygens,² in his 
1678 book, Traité de la Lumière, summarized his earlier findings that visible 
light is best regarded as a wave phenomenon, and made the first serious arguments 
for this point of view. Isaac Newton, his younger contemporary, 
opposed Huygens’ wave hypothesis and argued that light was composed of 
tiny solid particles. 

A wave is a disturbance that propagates through space. If some property of 
the environment (say, the level of the water in your bathtub) is disturbed at one 
place (perhaps by a splash), a wave is present if that disturbance moves con- 
tinuously from place to place in the environment (ripples from one end of your 
bathtub to the other, for example). Material particles, like bullets or ping-pong 
balls, also propagate from place to place. Waves and particles share many 
characteristic behaviors: both can reflect (change direction at an interface), 
refract (change speed in response to a change in the transmitting medium), and 
carry energy from place to place. However, waves exhibit two characteristic 
behaviors not shared by particles: 

Diffraction — the ability to bend around obstacles. A water wave entering a narrow 
opening, for example, will travel not only in the “shadow” of the opening but 
will spread in all directions on the far side. 

Interference — an ability to combine with other waves in predictable ways. Two 
water waves can, for example, destructively interfere if they combine so that the 
troughs of one always coincide with the peaks of the other. 
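
The difference between constructive and destructive interference can be sketched numerically. The short Python example below is only an illustration: the function name `superpose`, the unit amplitudes, and the sampling grid are assumptions made for the sketch, not anything taken from the text. It adds two identical sinusoidal waves, first in step and then shifted by half a cycle so that the peaks of one meet the troughs of the other.

```python
import math

def superpose(phase_shift: float, samples: int = 200) -> list[float]:
    """Sum of two unit-amplitude sine waves, the second shifted by phase_shift radians."""
    xs = [2.0 * math.pi * i / samples for i in range(samples)]
    return [math.sin(x) + math.sin(x + phase_shift) for x in xs]

# Largest displacement of the combined wave over one full cycle.
in_step = max(abs(v) for v in superpose(0.0))          # peaks meet peaks
out_of_step = max(abs(v) for v in superpose(math.pi))  # peaks meet troughs

print(f"in step:     {in_step:.3f}")
print(f"out of step: {out_of_step:.1e}")
```

In the first case the combined wave has twice the amplitude of either wave alone; in the second the two waves cancel to within rounding error.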

Although Huygens knew that light exhibited the properties of diffraction and 
interference, he unfortunately did not discuss them in his book. Newton’s rep- 
utation was such that his view prevailed until the early part of the nineteenth 
century, when Thomas Young and Augustin Fresnel were able to show how 
Huygens’ wave idea could explain diffraction and interference. Soon the 
evidence for waves proved irresistible. 

Well-behaved waves exhibit certain measurable qualities — amplitude, 
wavelength, frequency, and wave speed — and physicists in the generation 
following Fresnel were able to measure these quantities for visible light waves. 
Since light was a wave, and since waves are disturbances that propagate, it was 



2 Huygens (1629—1695), a Dutch natural philosopher and major figure in seventeenth-century 
science, had an early interest in lens grinding. He discovered the rings of Saturn and its large 
satellite, Titan, in 1655—1656, with a refracting telescope of his manufacture. At about the same 
time, he invented the pendulum clock, and formulated a theory of elastic bodies. He developed his 
wave theory of light later in his career, after he moved from The Hague to the more cosmopolitan 
environment of Paris. Near the end of his life, he wrote a treatise on the possibility of extrater- 
restrial life. 







Fig. 1.1 Acceleration of an electron produces a wave. (a) Undisturbed atoms in a 
source (A) and a receiver (B). Each atom consists of an electron attached to a 
nucleus by some force, which we represent as a spring. In (b) of the figure, the 
source electron has been disturbed, and oscillates between positions (1) and (2). 
The electron at B experiences a force that changes from $F_1$ to $F_2$ in the course 
of A's oscillation. The difference, $\Delta F$, is the amplitude of the changing part 
of the electric force seen by B. 



natural to ask: “What ‘stuff’ does a light wave disturb?” In one of the major 
triumphs of nineteenth-century physics, James Clerk Maxwell proposed an 
answer in 1873. 

Maxwell (1831—1879), a Scot, is a major figure in the history of physics, 
comparable to Newton and Einstein. His Adams Prize essay demonstrated that the 
rings of Saturn (discovered by Huygens) must be made of many small solid 
particles in order to be gravitationally stable. He conceived the kinetic theory of 
gases in 1866 (Ludwig Boltzmann did similar work independently), and transformed 
thermodynamics into a science based on statistics rather than determinism. 
His most important achievement was the mathematical formulation of the 
laws of electricity and magnetism in the form of four partial differential equa- 
tions. Published in 1873, Maxwell’s equations completely accounted for sepa- 
rate electric and magnetic phenomena and also demonstrated the connection 
between the two forces. Maxwell’s work is the culmination of classical physics, 
and its limits led both to the theory of relativity and the theory of quantum 
mechanics. 

Maxwell proposed that light disturbs electric and magnetic fields. The 
following example illustrates his idea. 

Consider a single electron, electron A. It is attached to the rest of the atom by 
means of a spring, and is sitting still. (The spring is just a mechanical model for 
the electrostatic attraction that holds the electron to the nucleus.) This pair of 
charges, the negative electron and the positive ion, is a dipole. A second electron, 
electron B, is also attached to the rest of the atom by a spring, but this second 
dipole is at some distance from A. Electron A repels B, and B’s position in its 
atom is in part determined by the location of A. The two atoms are sketched in 
Figure 1.1a. Now to make a wave: set electron A vibrating on its spring. Electron 
B must respond to this vibration, since the force it feels is changing direction. It 
moves in a way that will echo the motion of A. Figure 1.1b shows the changing 
electric force on B as A moves through a cycle of its vibration. 

The disturbance of dipole A has propagated to B in a way that suggests a 
wave is operating. Electron B behaves like an object floating in your bathtub that 
moves in response to the rising and falling level of a water wave. 

In trying to imagine the actual thing that a vibrating dipole disturbs, you might 
envision the water in a bathtub, and imagine an entity that fills space continuously 
around the electrons, the way a fluid would, so a disturbance caused by moving 
one electron can propagate from place to place. The physicist Michael Faraday³ 



3 Michael Faraday (1791—1867), considered by many the greatest experimentalist in history, began 
his career as a bookbinder with minimal formal education. His amateur interest in chemistry led to 
a position in the laboratory of the renowned chemist, Sir Humphry Davy, at the Royal Institution 
in London. Faraday continued work as a chemist for most of his productive life, but conducted an 
impressive series of experiments in electromagnetism in the period 1834—1855. His ideas, 
although largely rejected by physicists on the continent, eventually formed the empirical basis 
for Maxwell’s theory of electromagnetism. 








supplied the very useful idea of a field: an abstract entity (not a material fluid at 
all) created by charged particles that permeates space and gives other charged 
particles instructions about what force they should experience. In this conception, 
electron B consults the local field in order to decide how to move. Shaking 
(accelerating) the electron at A distorts the field in its vicinity, and this distortion 
propagates to vast distances, just like the ripples from a rock dropped into a calm 
and infinite ocean. 

The details of propagating a field disturbance turned out to be a little 
complicated. Hans Christian Oersted and André-Marie Ampère in 1820 had 
shown experimentally that a changing electric field, such as the one generated 
by an accelerated electron, produces a magnetic field. Acting on his intuition 
of an underlying unity in physical forces, Faraday performed experiments that 
confirmed his guess that a changing magnetic field must in turn generate an 
electric field. Maxwell had the genius to realize that his equations implied that 
the electric and magnetic field changes in a vibrating dipole would support one 
another, and produce a wave-like self-propagating disturbance. Change the 
electric field, and you thereby create a magnetic field, which then creates a 
different electric field, which creates a magnetic field, and so on, forever. 
Thus, it is proper to speak of the waves produced by an accelerated charged 
particle as electromagnetic. Figure 1.2 shows a schematic version of an elec- 
tromagnetic wave. The changes in the two fields, electric and magnetic, vary 
at right angles to one another and the direction of propagation is at right angles 
to both. 

Thus, a disturbance in the electric field does indeed seem to produce a wave. 
Is this electromagnetic wave the same thing as the light wave we see with our 
eyes? 




Fig. 1.2 A plane-polarized electromagnetic wave. The electric and magnetic field 
strengths are drawn as vectors that vary in both space and time. The illustrated 
waves are said to be plane-polarized because all electric vectors are confined to 
the same plane. 






From his four equations — the laws of electric and magnetic force — Maxwell 
derived the speed of any electromagnetic wave, which, in a vacuum, turned out 
to depend only on constants: 

$$c = (\varepsilon\mu)^{-1/2}$$

Here $\varepsilon$ and $\mu$ are well-known constants that describe the strengths of the 
electric and magnetic forces. (They are, respectively, the electric permittivity 
and magnetic permeability of the vacuum.) When he entered the experimental 
values for $\varepsilon$ and $\mu$ in the above equation, Maxwell computed the electromagnetic 
wave speed, which turned out to be numerically identical to the speed of light, a 
quantity that had been experimentally measured with improving precision over 
the preceding century. This equality of the predicted speed of electromagnetic 
waves and the known speed of light really was quite a convincing argument that 
light waves and electromagnetic waves were the same thing. Maxwell had 
shown that three different entities, electricity, magnetism, and light, were really 
one. 
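
Maxwell's calculation is easy to repeat. The Python sketch below is illustrative only: the numerical values of the vacuum permittivity and permeability are modern reference values supplied here as assumptions (the text does not quote them), and the function name `wave_speed` is hypothetical.

```python
import math

# Vacuum permittivity and permeability in SI units. These are modern
# reference values, assumed for this sketch; the text does not give them.
EPSILON_0 = 8.8541878128e-12   # farads per metre
MU_0 = 1.25663706212e-6        # henries per metre (close to 4*pi*1e-7)

def wave_speed(epsilon: float, mu: float) -> float:
    """Speed of an electromagnetic wave, c = (epsilon * mu) ** -0.5."""
    return 1.0 / math.sqrt(epsilon * mu)

c = wave_speed(EPSILON_0, MU_0)
print(f"c = {c:.4e} m/s")
```

The result is close to $3.00 \times 10^{8}$ m/s, the measured speed of light.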

Other predictions based on Maxwell’s theory further strengthened this 
view of the nature of light. For one thing, one can note that for any well-behaved 
wave the speed of the wave is the product of its frequency and wavelength: 

$$c = \lambda\nu$$

There is only one speed that electromagnetic waves can have in a vacuum; 
therefore there should be a one-dimensional classification of electromagnetic 
waves (the electromagnetic spectrum). In this spectrum, each wave is characterized 
only by its particular wavelength (or frequency, which is just $c/\lambda$). 
Table 1.1 gives the names for various portions or bands of the electromagnetic 
spectrum. 
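
Because $c$ is fixed, converting between wavelength and frequency is a one-line calculation. The following Python sketch assumes a rounded value for $c$ and uses a hypothetical helper, `frequency_from_wavelength`; the sample wavelengths are chosen only as illustrations of radio, visible, and ultraviolet boundaries.

```python
C = 2.998e8  # speed of light in a vacuum, m/s (rounded value, assumed here)

def frequency_from_wavelength(wavelength_m: float) -> float:
    """Frequency in hertz of a wave of the given vacuum wavelength, nu = c / lambda."""
    return C / wavelength_m

# Illustrative wavelengths in metres.
samples = {"1 mm": 1.0e-3, "700 nm": 700e-9, "10 nm": 10e-9}

frequencies = {label: frequency_from_wavelength(lam) for label, lam in samples.items()}
for label, nu in frequencies.items():
    print(f"{label:>7}: {nu:.2e} Hz")
```

A wavelength of 1 mm corresponds to roughly 300 GHz, and 10 nm to roughly $3 \times 10^{16}$ Hz.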

Maxwell’s wave theory of light very accurately describes the way light 
behaves in many situations. In summary, the theory says: 



Table 1.1. The electromagnetic spectrum. Region boundaries are not well-defined, so there is some 
overlap. Subdivisions are based in part on distinct detection methods. 

Band         Wavelength range    Frequency range               Subdivisions (long λ to short λ)
Radio        >1 mm               <300 GHz                      VLF-AM-VHF-UHF
Microwave    0.1 mm–3 cm         100 MHz–3000 GHz              Millimeter-submillimeter
Infrared     700 nm–1 mm         3 × 10^11 – 4 × 10^14 Hz      Far-Middle-Near
Visible      300 nm–800 nm       4 × 10^14 – 1 × 10^15 Hz      Red-Blue
Ultraviolet  10 nm–400 nm        7 × 10^14 – 3 × 10^16 Hz      Near-Extreme
X-ray        0.001 nm–10 nm      3 × 10^16 – 3 × 10^20 Hz      Soft-Hard
Gamma ray    <0.1 nm             >3 × 10^18 Hz                 Soft-Hard






1. Light exhibits all the properties of classical, well-behaved waves, namely: 

• reflection at interfaces 

• refraction upon changes in the medium 

• diffraction around edges 

• interference with other light waves 

• polarization in a particular direction (plane of vibration of the electric vector). 

2. A light wave can have any wavelength, selected from a range from zero to infinity. 
The range of possible wavelengths constitutes the electromagnetic spectrum. 

3. A light wave travels in a straight line at speed c in a vacuum. Travel in other media is 
slower and subject to refraction and absorption. 

4. A light wave carries energy whose magnitude depends on the squares of the ampli- 
tudes of the electric and magnetic waves. 

In 1873, Maxwell predicted the existence of electromagnetic waves outside 
the visible range and by 1888, Heinrich Hertz had demonstrated the production 
of radio waves based on Maxwell’s principles. Radio waves traveled at the 
speed of light and exhibited all the familiar wave properties like reflection 
and interference. This experimental confirmation convinced physicists that 
Maxwell had really discovered the secret of light. Humanity had made a tre- 
mendous leap in understanding reality. This leap to new heights, however, soon 
revealed that Maxwell had discovered only a part of the secret. 

1.3.2 Quantum mechanics and light 

It is very important to know that light behaves like particles, especially for those of 
you who have gone to school, where you were probably told something about light 
behaving like waves. I’m telling you the way it does behave — like particles. 

-Richard Feynman, QED, 1985 

Towards the end of the nineteenth century, physicists realized that electromag- 
netic theory could not account for certain behaviors of light. The theory that 
eventually replaced it, quantum mechanics, tells a different story about light. 
In the quantum story, light possesses the properties of a particle as well as 
the wave-like properties described by Maxwell’s theory. Quantum mechanics 
insists that there are situations in which we cannot think of light as a wave, but 
must think of it as a collection of particles, like bullets shot out of the source at 
the speed of light. These particles are termed photons. Each photon “contains” 
a particular amount of energy, E, that depends on the frequency it possesses 
when it exhibits its wave-like properties: 

$$E = h\nu$$

Here $h$ is Planck’s constant ($6.626 \times 10^{-34}$ J s) and $\nu$ is the frequency of 
the wave. Thus a single radio photon (low frequency) contains a small 






amount of energy, and a single gamma-ray photon (high frequency) contains 
a lot. 
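
To put numbers on this contrast, the sketch below evaluates $E = h\nu$ for two frequencies. It is a minimal illustration: the function name `photon_energy` is hypothetical, and the two frequencies are simply representative choices for a radio wave and a gamma ray, not values taken from the text.

```python
PLANCK_H = 6.626e-34   # Planck's constant in J s (value quoted in the text)

def photon_energy(frequency_hz: float) -> float:
    """Energy in joules of one photon of the given frequency, E = h * nu."""
    return PLANCK_H * frequency_hz

radio_photon = photon_energy(1.0e8)    # 100 MHz, an illustrative radio frequency
gamma_photon = photon_energy(3.0e20)   # an illustrative gamma-ray frequency

print(f"radio photon:     {radio_photon:.3e} J")
print(f"gamma-ray photon: {gamma_photon:.3e} J")
print(f"ratio:            {gamma_photon / radio_photon:.1e}")
```

With these choices the gamma-ray photon carries about $3 \times 10^{12}$ times the energy of the radio photon.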

The quantum theory of light gives an elegant and successful picture of 
the interaction between light and matter. In this view, atoms no longer 
have electrons bound to nuclei by springs or (what is equivalent in classical 
physics) electric fields. Electrons in an atom have certain permitted energy 
states described by a wave function — in this theory, everything, including 
electrons, has a wave as well as a particle nature. The generation or absorption 
of light by atoms involves the electron changing from one of these permitted 
states to another. Energy is conserved because the energy lost when an atom 
makes the transition from a higher to a lower state is exactly matched by the 
energy of the photon emitted. In summary, the quantum mechanical theory says: 

1. Light exhibits all the properties described in the wave theory in situations where wave 
properties are measured. 

2. Light behaves, in other circumstances, as if it were composed of massless particles 
called photons, each containing an amount of energy equal to its frequency times 
Planck’s constant. 

3. The interaction between light and matter involves creation and destruction of indi- 
vidual photons and the corresponding changes of energy states of charged particles 
(usually electrons). 

We will make great use of the quantum theory in later chapters, but for now 
our needs are more modest. 

1.3.3 A geometric approximation: light rays 

Since the quantum picture of light is as close as we can get to the real nature of 
light, you might think quantum mechanics would be the only theory worth 
considering. However, except in simple situations, application of the theory 
demands complex and lengthy computation. Fortunately, it is often possible 
to ignore much of what we know about light, and use a very rudimentary picture 
which pays attention to only those few properties of light necessary to under- 
stand much of the information brought to us by photons from out there. In this 
geometric approximation, we treat light as if it traveled in “rays” or streams that 
obey the laws of reflection and refraction as described by geometrical optics. It 
is sometimes helpful to imagine a collection of such rays as the paths taken by a 
stream of photons. 

Exactly how we picture a stream of photons will vary from case to case. 
Sometimes it is essential to recognize the discrete nature of the particles. In this 
case, think of the light ray as the straight path that a photon follows. We might 
then think of astronomical measurements as acts of counting and classifying 
the individual photons as they hit our detector like sparse raindrops tapping on a 
tin roof. 






On the other hand, there will be circumstances where it is profitable to ignore 
the lumpy nature of the photon stream — to assume it contains so many photons 
that the stream behaves like a smooth fluid. In this case, we think of astronomical 
measurements as recording smoothly varying quantities. The situation is 
like measuring the volume of rain that falls on Spain in a year: we might be 
aware that the rain arrived as discrete drops, but it is safe to ignore the fact. 

We will adopt this simplified ray picture for much of the discussion that 
follows, adjusting our awareness of the discrete nature of the photon stream or 
its wave properties as circumstances warrant. For the rest of this chapter, we use 
the ray picture to discuss two of the basic measurements important in astronomy: 

photometry measures the amount of energy arriving from a source; 
spectrometry measures the distribution of photons with wavelength. 

Besides photometry and spectrometry, the other general categories of measurement 
are imaging and astrometry, which are concerned with the appearance and 
positions of objects in the sky; and polarimetry, which is concerned with the 
polarization of light from the source. Incidentally, the word “wavelength” does 
not mean we are going to think deeply about the wave theory just yet. It will be 
sufficient to think of wavelength as a property of a light ray that can be measured — 
by where the photon winds up when sent through a spectrograph, for example. 



1.4 Measurements of light rays 

Twinkle, twinkle, little star, 

Flux says just how bright you are. 

— Anonymous, c. 1980 



1.4.1 Luminosity and brightness 

Astronomers have to construct the story of a distant object using only the tiny 
whisper of electromagnetic radiation it sends us. We define the (electromag- 
netic) luminosity, L, as the total amount of energy that leaves the surface of the 
source per unit time in the form of photons. Energy per unit time is called power, 
so we can measure L in physicists’ units for power (SI units): joules per second 
or watts. Alternatively, it might be useful to compare the object with the Sun, 
and we then might measure the luminosity in solar units: 

$L$ = Luminosity = Energy per unit time emitted by the entire source 

$L_\odot$ = Luminosity of the Sun = $3.825 \times 10^{26}$ W 

The luminosity of a source is an important clue about its nature. One way to 
measure luminosity is to surround the source completely with a box or bag of 
perfect energy-absorbing material, then use an “energy gauge” to measure the 






Fig. 1.3 Measuring luminosity by intercepting all the power from a source. 
[Figure labels: (a) source and astronomer; source in a box with an energy gauge.] 



total amount of energy intercepted by this enclosure during some time interval. 
Figure 1.3 illustrates the method. Luminosity is the amount of energy absorbed 
divided by the time interval over which the energy accumulates. The astrono- 
mer, however, cannot measure luminosity in this way. She is too distant from 
the source to put it in a box, even in the unlikely case she has a big enough box. 

Fortunately, there is a quantity related to luminosity, called the apparent 
brightness of the source, which is much easier to measure. 

Measuring apparent brightness is a local operation. The astronomer holds up 
a scrap of perfectly absorbing material of known area so that its surface is 
perpendicular to the line of sight to the source. She measures how much energy 
from the source accumulates in this material in a known time interval. Apparent 
brightness, F, is defined as the total energy per unit time per unit area that arrives 
from the source: 



$$F = \frac{E}{tA}$$



This quantity F is usually known as the flux or the flux density in the 
astronomical literature. In the physics literature, the same quantity is usually 
called the irradiance (or, in studies restricted to visual light, the illuminance). 
To make matters not only complex but also confusing, what astronomers call 
luminosity, L, physicists call the radiant flux. 

Whatever one calls it, $F$ will have units of power per unit area, or W m$^{-2}$. For 
example, the average flux from the Sun at the top of the Earth’s atmosphere (the 
apparent brightness of the Sun) is about 1370 W m$^{-2}$, a quantity known as the 
solar constant. 
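
The definition of apparent brightness translates directly into a small calculation. In the Python sketch below, the collecting area and exposure time are arbitrary illustrative numbers and the function name `apparent_brightness` is hypothetical; only the solar constant comes from the text.

```python
def apparent_brightness(energy_j: float, time_s: float, area_m2: float) -> float:
    """Flux F = E / (t * A): energy per unit time per unit area, in W m^-2."""
    return energy_j / (time_s * area_m2)

SOLAR_CONSTANT = 1370.0   # W m^-2, value quoted in the text
area = 1.0e-4             # one square centimetre of absorber (illustrative)
exposure = 10.0           # seconds (illustrative)

# Energy such an absorber would collect when held face-on to the Sun.
collected = SOLAR_CONSTANT * exposure * area

# Running the measurement in the other direction recovers the flux.
recovered = apparent_brightness(collected, exposure, area)

print(f"energy collected: {collected:.2f} J")
print(f"flux recovered:   {recovered:.0f} W/m^2")
```

A square centimeter of perfect absorber held face-on to the Sun above the atmosphere would therefore gather a little over one joule in ten seconds.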



1.4.2 The inverse square law of brightness 

Refer to Figure 1.4 to derive the relationship between the flux from a source and 
the source’s luminosity. We choose to determine the flux by measuring the 







Fig. 1.4 Measuring the apparent brightness of a source that is at distance r. 
The astronomer locally detects the power reaching a unit area perpendicular to 
the direction of the source. If the source is isotropic and there is no 
intervening absorber, then its luminosity is equal to the apparent brightness 
multiplied by the area of a sphere of radius r. 



power intercepted by the surface of a very large sphere of radius r centered on 
the source. The astronomer is on the surface of this sphere. Since this surface is 
everywhere perpendicular to the line of sight to the source, this is an acceptable 
way to measure brightness: simply divide the intercepted power by the total area 
of the sphere. But surrounding the source with a sphere is also like putting it into 
a very large box, as in Figure 1.3. The total power absorbed by the large sphere 
must be the luminosity, L, of the source. We assume that there is nothing located 
between the source and the spherical surface that absorbs light — no dark cloud 
or planet. The brightness, averaged over the whole sphere, then, is 

$$\langle F \rangle = \frac{L}{4\pi r^2}$$



Now make the additional assumption that the radiation from the source is iso- 
tropic (the same in all directions). Then the average brightness is the same as the 
brightness measured locally, using a small surface: 

$$F = \frac{L}{4\pi r^2} \qquad (1.1)$$




Both assumptions, isotropy and the absence of absorption, can be violated in 
reality. Nevertheless, in its simple form, Equation (1.1) not only represents one 
of the fundamental relationships in astronomy, it also reveals one of the central 
problems in our science. 
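
As a check on the inverse square law, one can ask what flux the Sun's luminosity implies at the Earth's distance. The Python sketch below assumes an isotropic, unabsorbed source with $F = L / 4\pi r^2$; the Earth-Sun distance is a value supplied here as an assumption (the text does not quote it), and the function name `flux_from_luminosity` is hypothetical.

```python
import math

L_SUN = 3.825e26   # luminosity of the Sun in watts (value quoted in the text)
AU = 1.496e11      # mean Earth-Sun distance in metres (assumed; not given in the text)

def flux_from_luminosity(luminosity_w: float, distance_m: float) -> float:
    """Apparent brightness F = L / (4 pi r^2) for an isotropic, unabsorbed source."""
    return luminosity_w / (4.0 * math.pi * distance_m ** 2)

flux_at_earth = flux_from_luminosity(L_SUN, AU)
flux_at_twice = flux_from_luminosity(L_SUN, 2.0 * AU)

print(f"flux at 1 AU: {flux_at_earth:.0f} W/m^2")
print(f"flux at 2 AU: {flux_at_twice:.0f} W/m^2")
```

The first value lands within about one percent of the solar constant of about 1370 W m$^{-2}$, and doubling the distance cuts the flux to one quarter.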

The problem is that the left-hand side of Equation (1.1) is a quantity that can 
be determined by direct observation, so potentially could be known to great 
accuracy. The expression on the right-hand side, however, contains two unknowns, 
luminosity and distance. Without further information these two cannot be dis- 
entangled. This is a frustration — you can’t say, for example, how much power a 
quasar is producing without knowing its distance, and you can’t know its 






distance unless you know how much power it is producing. A fundamental 
problem in astronomy is determining the third dimension. 
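
To see the degeneracy concretely, assume the isotropic, absorption-free inverse square law, and compare a source of luminosity $L$ at distance $r$ with a hypothetical one that is $k^2$ times more luminous and $k$ times farther away (the factor $k$ is introduced here only for illustration):

```latex
F = \frac{L}{4\pi r^{2}} = \frac{k^{2} L}{4\pi (k r)^{2}} \qquad \text{for any } k > 0
```

With $k = 10$, for instance, a source one hundred times more luminous and ten times more distant delivers exactly the same flux, so a measurement of $F$ alone cannot tell the two apart.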



1.4.3 Surface brightness 

One observable quantity that does not depend on the distance of a source is its 
surface brightness on the sky. Consider the simple case of a uniform spherical 
source of radius a and luminosity L. On the surface of the source, the amount of 
power leaving a unit area is 

$$s = \frac{L}{4\pi a^2}$$

Note that $s$, the power emitted per unit surface area of the source, has the same 
dimensions, W m$^{-2}$, as $F$, the apparent brightness seen by a distant observer. The 
two are very different quantities, however. The value of $s$ is characteristic only of 
the object itself, whereas $F$ changes with distance. Now, suppose that the object 
has a detectable angular size, so that our telescope can resolve it, that is, distinguish 
it from a point source. The solid angle, $\Omega$, in steradians, subtended by a spherical 
source of radius $a$ and distance $r$ is (for $a \ll r$) 



$$\Omega = \frac{\pi a^2}{r^2} \quad [\text{steradians}]$$



Now we write down $\sigma$, the apparent surface brightness of the source on the 
sky, that is, the flux that arrives per unit solid angle of the source. In our 
example, it is both easily measured and independent of distance: 

$$\sigma = \frac{F}{\Omega} = \frac{s}{\pi}$$



A more careful analysis of non-spherical, non-uniform emitting surfaces 
supports the conclusion that, for objects that can be resolved, $\sigma$ is invariant with 
distance. Ordinary optical telescopes can measure $\Omega$ with accuracy only if it has 
a value larger than a few square seconds of arc (about $10^{-10}$ steradians), mainly 
because of turbulence in the Earth’s atmosphere. Space telescopes and ground-based 
systems with adaptive optics can resolve solid angles perhaps a hundred 
times smaller. Unfortunately, the majority of even the nearest stars have angular 
sizes too small (diameters of a few milli-arcsec) to resolve with present instruments, 
so that for them $\Omega$ (and therefore $\sigma$) cannot be measured directly. 
Astronomers do routinely measure $\sigma$ for “extended objects” like planets, 
gaseous nebulae, and galaxies and find these values immensely useful. 
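
The distance independence of surface brightness can be verified numerically for the uniform spherical source. The Python sketch below is an illustration under that idealized assumption; the solar radius and the two trial distances are values assumed for the example, and the function names are hypothetical.

```python
import math

def flux(luminosity_w: float, distance_m: float) -> float:
    """Apparent brightness F = L / (4 pi r^2)."""
    return luminosity_w / (4.0 * math.pi * distance_m ** 2)

def solid_angle(radius_m: float, distance_m: float) -> float:
    """Solid angle of a sphere of radius a at distance r, valid for a << r."""
    return math.pi * radius_m ** 2 / distance_m ** 2

def surface_brightness(luminosity_w: float, radius_m: float, distance_m: float) -> float:
    """Flux per unit solid angle, sigma = F / Omega, in W m^-2 sr^-1."""
    return flux(luminosity_w, distance_m) / solid_angle(radius_m, distance_m)

L_SUN = 3.825e26   # W, value quoted in the text
R_SUN = 6.96e8     # m, solar radius (assumed; not given in the text)

near = surface_brightness(L_SUN, R_SUN, 1.0e3 * R_SUN)
far = surface_brightness(L_SUN, R_SUN, 1.0e6 * R_SUN)
s_over_pi = L_SUN / (4.0 * math.pi ** 2 * R_SUN ** 2)   # s / pi, with s = L / (4 pi a^2)

print(f"sigma (near): {near:.4e}")
print(f"sigma (far):  {far:.4e}")
print(f"s / pi:       {s_over_pi:.4e}")
```

Although the flux drops by a factor of a million between the two distances, the solid angle shrinks by the same factor, and both values of $\sigma$ agree with $s/\pi$.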



1.5 Spectra 

If a question on an astronomy exam starts with the phrase “how do we know ...” 
then the answer is probably “spectrometry”. 



— Anonymous, c. 1950 






Astronomers usually learn most about a source not from its flux, surface 
brightness, or luminosity, but from its spectrum: the way in which light is 
distributed with wavelength. Measuring its luminosity is like glancing at the 
cover of a book about the source — with luck, you might read the title. Meas- 
uring its spectrum is like opening the book and skimming a few chapters, 
chapters that might explain the source’s chemical composition, pressure, den- 
sity, temperature, rotation speed, or radial velocity. Although evidence in 
astronomy is usually meager, the most satisfying and eloquent evidence is 
spectroscopic. 



1.5.1 Monochromatic flux 

Consider measuring the flux from a source in the usual fashion, with a set-up 
like the one in Figure 1.4. Arrange our perfect absorber to absorb only photons 
that have frequencies between $\nu$ and $\nu + d\nu$, where $d\nu$ is an infinitesimally small 
frequency interval. Write the result of this measurement as $F(\nu, \nu + d\nu)$. As with 
all infinitesimal quantities, you should keep in mind that $d\nu$ and $F(\nu, \nu + d\nu)$ are 
the limits of finite quantities called $\Delta\nu$ and $F(\nu, \nu + \Delta\nu)$. We then define 
monochromatic flux or monochromatic brightness as 



$$f_\nu = \frac{F(\nu, \nu + d\nu)}{d\nu} = \lim_{\Delta\nu \to 0} \frac{F(\nu, \nu + \Delta\nu)}{\Delta\nu} \qquad (1.2)$$



The complete function, $f_\nu$, running over all frequencies (or even over a 
limited range of frequencies), is called the spectrum of the object. It has units 
[W m$^{-2}$ Hz$^{-1}$]. The extreme right-hand side of Equation (1.2) reminds us that $f_\nu$ 
is the limiting value of the ratio as the quantity $\Delta\nu$ [and correspondingly 
$F(\nu, \nu + \Delta\nu)$] become indefinitely small. For measurement, $\Delta\nu$ must have a 
finite size, since $F(\nu, \nu + \Delta\nu)$ must be large enough to register on a detector. 
If $\Delta\nu$ is large, the detailed wiggles and jumps in the spectrum will be smoothed 
out, and one is said to have measured a low-resolution spectrum. Likewise, a 
high-resolution spectrum will more faithfully show the details of the limiting 
function, $f_\nu$. 
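
The effect of the bin width on a measured spectrum can be sketched with a toy example. Everything in the Python code below is hypothetical: the spectrum is an invented flat continuum with one narrow absorption line, frequencies and fluxes are in arbitrary units, and the integral is approximated with a simple midpoint rule.

```python
import math

def f_nu(nu: float) -> float:
    """Hypothetical spectrum: flat continuum of 1.0 with a narrow absorption line at nu = 5."""
    return 1.0 - 0.8 * math.exp(-0.5 * ((nu - 5.0) / 0.05) ** 2)

def band_flux(nu_lo: float, nu_hi: float, steps: int = 2000) -> float:
    """F(nu_lo, nu_hi): midpoint-rule integral of f_nu over the interval."""
    width = (nu_hi - nu_lo) / steps
    return sum(f_nu(nu_lo + (i + 0.5) * width) for i in range(steps)) * width

def measured_density(nu_center: float, delta_nu: float) -> float:
    """Finite-bin estimate F / delta_nu for a bin of width delta_nu centred on nu_center."""
    return band_flux(nu_center - delta_nu / 2.0, nu_center + delta_nu / 2.0) / delta_nu

high_res = measured_density(5.0, 0.01)   # bin much narrower than the line
low_res = measured_density(5.0, 2.0)     # bin much wider than the line

print(f"true value at line centre: {f_nu(5.0):.3f}")
print(f"high-resolution estimate:  {high_res:.3f}")
print(f"low-resolution estimate:   {low_res:.3f}")
```

The narrow bin recovers nearly the full depth of the line, while the wide bin averages the line with the surrounding continuum and all but erases it.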

If we choose the wavelength as the important characteristic of light, we can 
define a different monochromatic brightness, $f_\lambda$. Symbolize the flux between 
wavelengths $\lambda$ and $\lambda + d\lambda$ as $F(\lambda, \lambda + d\lambda)$ and write 

$$f_\lambda = \frac{F(\lambda, \lambda + d\lambda)}{d\lambda}$$

Although the functions $f_\nu$ and $f_\lambda$ are each called the spectrum, they differ from 
one another in numerical value and overall appearance for the same object. 
Figure 1.5 shows schematic low-resolution spectra of the bright star, Vega, 
plotted over the same range of wavelengths, first as $f_\lambda$, then as $f_\nu$. 
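
The two forms are connected by a standard change of variable, which is not spelled out at this point in the text. The following sketch assumes only that $c = \lambda\nu$ and that the same flux is counted whether a narrow interval is labeled by wavelength or by frequency:

```latex
\begin{aligned}
f_\lambda \,|d\lambda| &= f_\nu \,|d\nu|, \qquad
\nu = \frac{c}{\lambda} \;\Rightarrow\; \left|\frac{d\nu}{d\lambda}\right| = \frac{c}{\lambda^{2}}, \\
f_\lambda &= \frac{c}{\lambda^{2}}\, f_\nu = \frac{\nu^{2}}{c}\, f_\nu .
\end{aligned}
```

The factor $c/\lambda^{2}$ is what gives the two curves different shapes: a spectrum that is flat in $f_\nu$, for example, falls as $1/\lambda^{2}$ when plotted as $f_\lambda$.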






Fig. 1.5 Two forms of the ultraviolet and visible outside-the-atmosphere spectrum 
of Vega. The two curves convey the same information, but have very different 
shapes. Units on the vertical axes are arbitrary. 





1.5.2 Flux within a band 

Less is more. 

— Robert Browning, Andrea del Sarto (1855), often quoted by L. Mies van der Rohe 

An ideal bolometer is a detector that responds to all wavelengths with perfect 
efficiency. In a unit time, a bolometer would record every photon reaching it 
from a source, regardless of wavelength. We could symbolize the bolometric 
flux thereby recorded as the integral: 



$$F_{\mathrm{bol}} = \int_0^\infty f_\lambda \, d\lambda$$



Real bolometers operate by monitoring the temperature of a highly absorbing 
(i.e. black) object of low thermal mass. They are imperfect in part because it is 
difficult to design an object that is “black” at all wavelengths. More commonly, 
practical instruments for measuring brightness can only detect light within a 
limited range of wavelengths or frequencies. Suppose a detector registers light 
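between wavelengths $\lambda_1$ and $\lambda_2$, and nothing outside this range. In the notation 
of the previous section, we might then write the flux in the $\lambda_1, \lambda_2$ pass-band as: 

$$F(\lambda_1, \lambda_2) = \int_{\lambda_1}^{\lambda_2} f_\lambda \, d\lambda = F(\nu_2, \nu_1) = \int_{\nu_2}^{\nu_1} f_\nu \, d\nu$$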



Usually, the situation is even more complex. In addition to having a sensitivity 
to light that “cuts on” at one wavelength, and “cuts off” at another, practical 
detectors vary in detecting efficiency over this range. If $R_A(\lambda)$ is the fraction of 
the incident flux of wavelength $\lambda$ that is eventually detected by instrument A, 
then the flux actually recorded by such a system might be represented as 

$$F_A = \int_0^\infty R_A(\lambda) f_\lambda \, d\lambda \qquad (1.3)$$






The function $R_A(\lambda)$ may be imposed in part by the environment rather than by 
the instrument. The Earth’s atmosphere, for example, is (imperfectly) transparent 
only in the visible and near-infrared pass-band between about 0.32 and 1 
micron (extending in restricted bands to 25 μm at very dry, high-altitude sites), 
and in the microwave–radio pass-band between about 0.5 millimeters and 50 
meters. 
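
In practice the integral for the recorded flux is evaluated numerically from tabulated values of the response and the spectrum. The Python sketch below shows one way this might be done with the trapezoidal rule. The wavelength grid, the flat spectrum, and the boxcar response (50 percent efficient between 0.5 and 0.6 μm, zero elsewhere) are all invented for the illustration, and the function names are hypothetical.

```python
def trapezoid(xs: list[float], ys: list[float]) -> float:
    """Trapezoidal-rule integral of tabulated ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1))

def recorded_flux(wavelengths: list[float], f_lambda: list[float], response: list[float]) -> float:
    """Integral of R_A(lambda) * f_lambda over the tabulated wavelength grid."""
    weighted = [r * f for r, f in zip(response, f_lambda)]
    return trapezoid(wavelengths, weighted)

# Hypothetical inputs: wavelengths in micrometres, flux density in arbitrary units.
n = 1001
wavelengths = [0.3 + 0.7 * i / (n - 1) for i in range(n)]            # 0.3 to 1.0 um
f_lambda = [2.0 for _ in wavelengths]                                 # flat spectrum
response = [0.5 if 0.5 <= w <= 0.6 else 0.0 for w in wavelengths]     # boxcar filter

F_A = recorded_flux(wavelengths, f_lambda, response)
print(f"recorded flux: {F_A:.3f}")
```

For these inputs the result should be close to $0.5 \times 2.0 \times 0.1 = 0.1$; the small discrepancy comes from the finite grid spacing at the sharp edges of the boxcar.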

Astronomers often intentionally restrict the range of a detector’s sensitivity, 
and employ filters to control the form of the efficiency function, $R_A$, particularly 
to define cut-on and cut-off wavelengths. Standard filters are useful for several 
reasons. First, a well-defined band makes it easier for different astronomers to 
compare measurements. Second, a filter can block troublesome wavelengths, 
ones where the background is bright, perhaps, or where atmospheric transmis- 
sion is low. Finally, comparison of two or more different band-pass fluxes for 
the same source is akin to measuring a very low-resolution spectrum, and thus 
can provide some of the information, like temperature or chemical composition, 
that is conveyed by its spectrum. 

Hundreds of bands have found use in astronomy. Table 1.2 lists the 
broad band filters (i.e. filters where the bandwidth, $\Delta\lambda = \lambda_2 - \lambda_1$, is large) 
that are most commonly encountered in the visible–near-infrared window. 
Standardization of bands is less common in radio and high-energy observa- 
tions. 



Table 1.2. Common broad band-passes in the visible (UBVRI), near- 
infrared (JHKLM), and mid-infrared (NQ). Chapter 10 discusses standard 
bands in greater detail 



Name   λc (μm)   Width (μm)   Rationale
U      0.365     0.068        Ultraviolet
B      0.44      0.098        Blue
V      0.55      0.089        Visual
R      0.70      0.22         Red
I      0.90      0.24         Infrared
J      1.25      0.38
H      1.63      0.31
K      2.2       0.48
L      3.4       0.70
M      5.0       1.123
N      10.2      4.31
Q      21.0      8





1.5.3 Spectrum analysis 

[With regard to stars] ... we would never know how to study by any means their 
chemical composition ... In a word, our positive knowledge with respect to stars 
is necessarily limited solely to geometrical and mechanical phenomena . . . 

— Auguste Comte, Cours de Philosophie Positive II, 19th Lesson, 1835

... I made some observations which disclose an unexpected explanation of the 
origin of Fraunhofer’s lines, and authorize conclusions therefrom respecting 
the material constitution of the atmosphere of the sun, and perhaps also of that of 
the brighter fixed stars. 

— Gustav R. Kirchhoff, Letter to the Academy of Science at Berlin, 1859 

Astronomers are fond of juxtaposing Comte’s pronouncement about the impossibility
of knowing the chemistry of stars with Kirchhoff’s breakthrough a
generation later. Me too. Comte deserves better, since he wrote quite thought- 
fully about the philosophy of science, and would certainly have been among the 
first to applaud the powerful new techniques of spectrum analysis developed later 
in the century. Nevertheless, the failure of his dictum about what is knowable is a 
caution against pomposity for all. 

Had science been quicker to investigate spectra, Comte might have been 
spared posthumous deflation. In 1666, Newton observed the dispersion of 
visible “white” sunlight into its component colors by glass prisms, but subse- 
quent applications of spectroscopy were very slow to develop. It was not until 
the start of the nineteenth century that William Herschel and Johann Wilhelm 
Ritter used spectrometers to demonstrate the presence of invisible electromag- 
netic waves beyond the red and violet edges of the visible spectrum. The English 
physicist William Wollaston, in 1802, noted the presence of dark lines in the 
visible solar spectrum. 

The Fraunhofer spectrum. Shortly thereafter (c. 1812), Joseph von Fraunhofer 
(1787—1826), using a much superior spectroscope, unaware of Wollaston’s work, 
produced an extensive map of the solar absorption lines. 

Of humble birth, Fraunhofer began his career as an apprentice at a glass- 
making factory located in an abandoned monastery in Benediktbeuern, out- 
side Munich. By talent and fate (he survived a serious industrial accident) he 
advanced quickly in the firm. The business became quite successful and 
famous because of a secret process for making large blanks of high-quality 
crown and flint glass, which had important military and civil uses. Observing 
the solar spectrum with the ultimate goal of improving optical instruments, 
Fraunhofer pursued what he believed to be his discovery of solar absorption 
lines with characteristic enthusiasm and thoroughness. By 1814, he had given 
precise positions for 350 lines and approximate positions for another 225 
fainter lines (see Figure 1.6). By 1823, Fraunhofer was reporting on the
spectra of bright stars and planets, although his main attention was devoted
to producing high-quality optical instruments. Fraunhofer died from tuberculosis
at the age of 39. One can only speculate on the development of
astrophysics had Fraunhofer been able to remain active for another quarter
century.

Fig. 1.6 A much-reduced reproduction of one of Fraunhofer's drawings of the solar spectrum. Frequency increases to the right, and the stronger absorption lines are labeled with his designations. Modern designations differ slightly.

Fraunhofer’s lines are narrow wavelength bands where the value of the function
$f_\lambda$ drops almost discontinuously, then rises back to the previous “continuum”
(see Figures 1.6 and 1.7). The term “line” arises because in visual spectro- 
scopy one actually examines the image of a narrow slit at each frequency. If the 
intensity is unusually low at a particular frequency, then the image of the slit 
there looks like a dark line. Fraunhofer designated the ten most prominent of 
these dark lines in the solar spectrum with letters (Figure 1.6, and Appendix 
B2). He noted that the two dark lines he labeled with the letter D occurred at 
wavelengths identical to the two bright emission lines produced by a candle 
flame. (Emission lines are narrow wavelength regions where the value of the function
$f_\lambda$ increases almost discontinuously, then drops back down to the continuum;
see Figure 1.7.) Soon several observers noted that the bright D lines, which
occur in the yellow part of the spectrum at wavelengths of 589.0 and 589.6 
nanometers, always arise from the presence of sodium in a flame. At about this 
same time, still others noted that heated solids, unlike the gases in flames, 
produce continuous spectra (no bright or dark lines — again, see Figure 1.7). 
Several researchers (John Herschel, William Henry Fox Talbot, David Brewster) 
in the 1820s and 1830s suggested that there was a connection between the 
composition of an object and its spectrum, but none could describe it precisely. 

The Kirchhoff-Bunsen results. Spectroscopy languished for the thirty years
following Fraunhofer’s death in 1826. Then, in Heidelberg in 1859, physicist
Gustav Kirchhoff and chemist Robert Bunsen performed a crucial experiment.
They passed a beam of sunlight through a sodium flame, perhaps expecting the
dark D lines to be filled in by the bright lines from the flame in the resulting
spectrum. What they observed instead was that the dark D lines became darker
still. Kirchhoff reasoned that the hot gas in the flame had both absorbing and
emitting properties at the D wavelengths, but that the absorption became more
apparent as more light to be absorbed was supplied, whereas the emitting properties
remained constant. This suggested that absorption lines would always be
seen in situations like that sketched on the right of Figure 1.7c, so long as a
sufficiently bright source was observed through a gas. If the background source
were too weak or altogether absent, then the situation sketched in Figure 1.7b
would hold and emission lines would appear.

Fig. 1.7 Three types of spectra and the situations that produce them. (a) A solid, liquid, or dense gas produces a continuous spectrum. (b) An emission-line spectrum is produced by rarefied gas like a flame or a spark. (c) An absorption-line spectrum is produced when a source with a continuous spectrum is viewed through a rarefied gas.

Kirchhoff, moreover, proposed an explanation of all the Fraunhofer lines: 
The Sun consists of a bright source — a very dense gas, as it turns out — that 
emits a continuous spectrum. A low-density, gaseous atmosphere surrounds 
this dense region. As the light from the continuous source passes through the 
atmosphere, the atmosphere absorbs those wavelengths characteristic of its 
chemical composition. One could then conclude, for example, that the 
Fraunhofer D lines demonstrated the presence of sodium in the solar atmos- 
phere. Identification of other chemicals in the solar atmosphere became a 
matter of obtaining an emission-line “fingerprint” from a laboratory flame 
or spark spectrum, then searching for the corresponding absorption line or 
lines at identical wavelengths in the Fraunhofer spectrum. Kirchhoff and 
Bunsen quickly confirmed the presence of potassium, iron, and calcium; and 
the absence (or very low abundance) of lithium in the solar atmosphere. The 
Kirchhoff-Bunsen results were not limited to the Sun. The spectra of almost






all stars turned out to be absorption spectra, and it was easy to identify many of 
the lines present. 

Quantitative chemical analysis of solar and stellar atmospheres became 
possible in the 1940s, after the development of astrophysics in the early twen- 
tieth century. At that time astronomers showed that most stars were composed of 
hydrogen and helium in a roughly 12 to 1 ratio by number, with small additions 
of other elements. However, the early qualitative results of Kirchhoff and 
Bunsen had already demonstrated to the world that stars were made of ordinary 
matter, and that one could hope to learn their exact composition by spectrom- 
etry. By the 1860s they had replaced the “truth” that stars were inherently 
unknowable with the new “truth” that stars were made of ordinary stuff. 

Blackbody spectra. In 1860, Kirchhoff discussed the ratio of absorption to 
emission in hot objects by first considering the behavior of a perfect absorber, an 
object that would absorb all light falling on its surface. He called such an object 
a blackbody, since it by definition would reflect nothing. Blackbodies, however, 
must emit (otherwise their temperatures would always increase from absorbing 
ambient radiation). One can construct a simple blackbody by drilling a small 
hole into a uniform oven. The hole is the blackbody. The black walls of the oven 
will always absorb light entering the hole, so that the hole is a perfect absorber. 
The spectrum of the light emitted by the hole will depend on the temperature of 
the oven (and, it turns out, on nothing else). The blackbody spectrum is usually a 
good approximation to the spectrum emitted by any solid, liquid, or dense gas. 
(Low-density gases produce line spectra.) 

In 1878, Josef Stefan found experimentally that the surface brightness of a 
blackbody (total power emitted per unit area) depends only on the fourth power 
of its temperature, and, in 1884, Ludwig Boltzmann supplied a theoretical 
understanding of this relation. The Stefan-Boltzmann law is

$$s = \sigma T^4$$

where $\sigma$ = the Stefan-Boltzmann constant = $5.6696 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$.
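A quick numerical check of the Stefan-Boltzmann law may help fix the scale. The sketch below uses the constant quoted above; the function name and the two sample temperatures (6000 K is roughly the temperature of the solar surface) are illustrative choices.

```python
SIGMA = 5.6696e-8  # Stefan-Boltzmann constant, W m^-2 K^-4 (value quoted in the text)


def surface_brightness(temperature_k):
    """Total power emitted per unit area of a blackbody, s = sigma * T**4, in W m^-2."""
    return SIGMA * temperature_k ** 4


s_3000 = surface_brightness(3000.0)
s_6000 = surface_brightness(6000.0)

print(s_6000)           # roughly 7.3e7 W m^-2
print(s_6000 / s_3000)  # doubling T raises the output by a factor of 2**4 = 16
```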

Laboratory studies of blackbodies at about this time showed that although their
spectra change with temperature, all have a similar shape: a smooth curve with
one maximum (see Figure 1.8). This peak in the monochromatic flux curve (either
$f_\lambda$ or $f_\nu$) shifts to shorter wavelengths with increasing temperature, following a
relation called Wien’s displacement law (1893). Wien’s law states that for $f_\lambda$

$$\lambda_{\mathrm{MAX}} \, T = 2.8979 \times 10^{-3} \ \mathrm{m\,K}$$

or equivalently for $f_\nu$

$$\frac{T}{\nu_{\mathrm{MAX}}} = 1.7344 \times 10^{-11} \ \mathrm{Hz^{-1}\,K}$$



Fig. 1.8 Blackbody spectra B(λ,T) for objects at three different surface temperatures. The surface temperature of the Sun is near 6000 K. Horizontal axis: wavelength (μm).



Thus blackbodies are never black: the color of a glowing object shifts from red
to yellow to blue as it is heated.
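The color shift can be made concrete with the wavelength form of Wien’s law. This is a minimal sketch; the function name and the three sample temperatures are chosen only for illustration.

```python
WIEN_B = 2.8979e-3  # m K, constant in Wien's displacement law as quoted in the text


def peak_wavelength_nm(temperature_k):
    """Wavelength (nm) at which f_lambda of a blackbody peaks."""
    return WIEN_B / temperature_k * 1e9


for t in (3000.0, 6000.0, 12000.0):
    print(t, round(peak_wavelength_nm(t), 1))
# 3000 K peaks near 966 nm (infrared), 6000 K near 483 nm (visible),
# and 12000 K near 241 nm (ultraviolet).
```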

Max Planck presented the actual functional form for the blackbody spec- 
trum at a Physical Society meeting in Berlin in 1900. His subsequent attempt 
to supply a theoretical understanding of the empirical “Planck function” led
him to introduce the quantum hypothesis — that energy can only radiate in 
discrete packets. Later work by Einstein and Bohr eventually showed the 
significance of this hypothesis as a fundamental principle of quantum 
mechanics. 

The Planck function gives the specific intensity, that is, the monochromatic
flux per unit solid angle, usually symbolized as $B(\nu, T)$ or $B_\nu(T)$. The total power
emitted by a unit surface of a blackbody over all angles is just $s(\nu, T) = \pi B(\nu, T)$. The
Planck function is

$$B(\nu, T) = \frac{2h\nu^3}{c^2} \, \frac{1}{\exp\left(\frac{h\nu}{kT}\right) - 1}$$

$$B(\lambda, T) = \frac{2hc^2}{\lambda^5} \, \frac{1}{\exp\left(\frac{hc}{\lambda kT}\right) - 1}$$



For astronomers, the Planck function is especially significant because it shows 
exactly how the underlying continuous spectrum of a dense object depends on 
temperature and wavelength. Since the shape of the spectrum is observable, this 
means that one can deduce the temperature of any object with a “Planck-like” 
spectrum. Figure 1.8 shows the Planck function for several temperatures. Note 
that even if the position of the peak of the spectrum cannot be observed, the 
slope of the spectrum gives a measure of the temperature. 
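To see how the slope of the spectrum tracks temperature, the Planck function can be evaluated directly. The sketch below is illustrative only: the function names, the two sample wavelengths (440 nm and 700 nm), and the rounded value used for the speed of light are assumptions, not part of the text.

```python
import math

H = 6.626e-34     # Planck's constant, J s
K_B = 1.3806e-23  # Boltzmann constant, J K^-1
C = 2.998e8       # speed of light, m s^-1 (rounded; an assumed value)


def planck_lambda(wavelength_m, temperature_k):
    """Specific intensity B(lambda, T) in W m^-2 m^-1 sr^-1."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    return 2.0 * H * C ** 2 / wavelength_m ** 5 / math.expm1(x)


def planck_nu(frequency_hz, temperature_k):
    """Specific intensity B(nu, T) in W m^-2 Hz^-1 sr^-1."""
    x = H * frequency_hz / (K_B * temperature_k)
    return 2.0 * H * frequency_hz ** 3 / C ** 2 / math.expm1(x)


def blue_to_red_ratio(temperature_k, blue_m=440e-9, red_m=700e-9):
    """Ratio B(blue)/B(red): a crude 'slope' that rises with temperature."""
    return planck_lambda(blue_m, temperature_k) / planck_lambda(red_m, temperature_k)


# Hotter blackbodies are relatively brighter in the blue than in the red.
for t in (3000.0, 6000.0, 12000.0):
    print(t, blue_to_red_ratio(t))
```

Because the ratio increases steadily with temperature, a measured ratio of two band-pass fluxes can in principle be inverted to estimate the temperature, even when the peak itself lies outside the observed range.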






At long wavelengths, it is useful to note the Rayleigh-Jeans approximation
to the tail of the Planck function:

$$B(\lambda, T) = \frac{2ckT}{\lambda^4}$$

$$B(\nu, T) = 2kT \, \frac{\nu^2}{c^2}$$
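How long is "long"? One way to get a feel for it is to compare the approximation with the full Planck function at a few wavelengths. This is a sketch under stated assumptions: the temperature, the sample wavelengths, the function names, and the rounded speed of light are all illustrative.

```python
import math

H = 6.626e-34     # Planck's constant, J s
K_B = 1.3806e-23  # Boltzmann constant, J K^-1
C = 2.998e8       # speed of light, m s^-1 (rounded; an assumed value)


def planck_lambda(wavelength_m, temperature_k):
    """Full Planck function B(lambda, T)."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    return 2.0 * H * C ** 2 / wavelength_m ** 5 / math.expm1(x)


def rayleigh_jeans_lambda(wavelength_m, temperature_k):
    """Rayleigh-Jeans approximation B(lambda, T) = 2 c k T / lambda**4."""
    return 2.0 * C * K_B * temperature_k / wavelength_m ** 4


def rj_over_planck(wavelength_m, temperature_k):
    """Ratio of the approximation to the exact value (1 means perfect agreement)."""
    return rayleigh_jeans_lambda(wavelength_m, temperature_k) / planck_lambda(
        wavelength_m, temperature_k
    )


T = 6000.0
for wavelength_m in (0.5e-6, 10e-6, 1e-3):
    print(wavelength_m, rj_over_planck(wavelength_m, T))
# The ratio is far above 1 in the visible and approaches 1 toward millimeter
# wavelengths, where h*c/(lambda*k*T) is much less than 1.
```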



1.5.4 Spectra of stars 

When astronomers first examined the absorption line spectra of large numbers 
of stars other than the Sun, they did not fully understand what they saw. Flooded 
with puzzling but presumably significant observations, most scientists have the 
(good) impulse to look for similarities and patterns: to sort the large number of 
observations into a small number of classes. Astronomers did their initial sorting 
of spectra into classes on the basis of the overall simplicity of the pattern of 
lines, assigning the simplest to class A, next simplest to B and so on through the 
alphabet. Only after a great number of stars had been so classified from photographs⁴
(in the production of the Henry Draper Catalog; see Chapter 4) did
astronomers come to understand, through the new science of astrophysics, that a 
great variety in stellar spectra arises mainly from temperature differences. 

There is an important secondary effect due to surface gravity, as well as some 
subtle effects due to variations in chemical abundance. The chemical differ- 
ences usually involve only the minor constituents — the elements other than 
hydrogen and helium. 

The spectral type of a star, then, is basically an indication of its effective 
temperature — that is, the temperature of the blackbody that would produce the 
same amount of radiation per unit surface area as the star does. If the spectral 
type is sufficiently precise, it might also indicate the surface gravity or relative 
diameter or luminosity (if two stars have the same temperature and mass, the 
one with the larger diameter has the lower surface gravity as well as the higher 
luminosity). For reference, Table 1.3 lists the modern spectral classes and corresponding
effective temperatures. The spectral type of a star consists of three
designations: (a) a letter, indicating the general temperature class, (b) a decimal
subclass number between 0 and 9 refining the temperature estimate (0 indicating
the hottest and 9.9 the coolest subclass), and (c) a Roman numeral indicating the
relative surface gravity or luminosity class. Luminosity class I (the supergiants)
is most luminous, III (the giants) is intermediate, and V (the dwarves) is least
luminous. Dwarves are by far the most common luminosity class. The Sun, for
example, has spectral type G2 V.

⁴ Antonia Maury suggested in 1897 that the correct sequence of types should be O through M (the
first seven in Table 1.3, although Maury used a different notation scheme). Annie Cannon, in
1901, justified this order on the basis of continuity, not temperature. Cannon’s system solidified
the modern notation and was quickly adopted by the astronomical community. In 1921, Megh Nad
Saha used atomic theory to explain the Cannon sequence as one of stellar temperature.

Table 1.3. Modern spectral classes in order of decreasing temperature. The L and T classes are
recent additions. Some objects of class L and all of class T are not true stars but brown dwarves.
Temperatures marked with a colon are uncertain

Type   Temperature range, K   Main characteristic of absorption line spectra
O      >30 000                Ionized He lines
B      30 000-9800            Neutral He lines, strengthening neutral H
A      9800-7200              Strong neutral H, weak ionized metals
F      7200-6000              H weaker, ionized Ca strong, strong ionized and neutral metals
G      6000-5200              Ionized Ca strong, very strong neutral and ionized metals
K      5200-3900              Very strong neutral metals, CH and CN bands
M      3900-2100:             Strong TiO bands, some neutral Ca
L      2100:-1500:            Strong metal hydride molecules, neutral Na, K, Cs
T      <1500:                 Methane bands, neutral K, weak water

There are some rarely observed spectral types that are not listed in Table 1.3. 
A few exhibit unusual chemical abundances, like carbon stars (spectral types R 
and N) and Wolf—Rayet stars (type W) and stars with strong zirconium oxide 
bands (type S). White dwarves (spectral types DA, DB, and several others) are 
small, dense objects of low luminosity that have completely exhausted their 
supply of nuclear fuel and represent endpoints of stellar evolution. Brown 
dwarves, which have spectral types L and T, are stars that do not have sufficient 
mass to initiate full-scale thermonuclear reactions. We observe relatively few 
white dwarves and stars in classes L and T because their low luminosities make 
them hard to detect. All three groups are probably quite abundant in the galaxy. 
Stars of types W, R, N, and S, in contrast, are luminous and intrinsically rare. 



1.6 Magnitudes 

1.6.1 Apparent magnitudes 

When Hipparchus of Rhodes (c. 190-120 BC), arguably the greatest astronomer 
in the Hellenistic school, published his catalog of 600 stars, he included an 
estimate of the brightness of each, our quantity F. Strictly, what Hipparchus
and all visual observers estimate is $F_{vis}$, the flux in the visual band-pass, the band
corresponding to the response of the human eye. The eye has two different
response functions, corresponding to two different types of receptor cells: rods
and cones. At high levels of illumination, only the cones operate (photopic
vision), and the eye is relatively sensitive to red light. At low light levels






(scotopic vision) only the rods operate, and sensitivity shifts to the blue. 
Greatest sensitivity is at about 555 nm (yellow) for cones and 505 nm (green) 
for rods. Except for extreme red wavelengths, scotopic vision is more sensitive 
than photopic and most closely corresponds to the Hipparchus system. See 
Appendix B3. 

Hipparchus cataloged brightness by assigning each star to one of six classes, 
the first class (or first magnitude) being the brightest, the sixth class the faintest. 
The choice of six classes, rather than some other number — ten, for example — is 
a curious one, and may be tied to earlier Babylonian mysticism, which held six 
to be a significant number. For the next two millennia, astronomers perpetuated 
this system, eventually extending it to fainter stars at higher magnitudes: mag- 
nitudes 7, 8, 9, etc. could only be seen with a telescope. With the introduction of 
photometers in the nineteenth century, William Pogson (c. AD 1856) discovered 
that Hipparchus’s classes were in fact approximately a geometric progression in 
F, with each class two or three times fainter than the preceding. Pogson proposed
regularizing the system so that a magnitude difference of 5 corresponds
to a brightness ratio of 100:1, a proposal eventually adopted by international
agreement early in the twentieth century. 

Astronomers who observe in the visual and near infrared persist in using 
this system. It has advantages: for example, all astronomical bodies have 
apparent magnitudes that fall in the restricted and easy-to-comprehend range 
of about −26 (the Sun) to +30 (the faintest telescopic objects). However,
when Hipparchus and Pogson assigned the more positive magnitudes to the 
fainter objects, they were asking for trouble. Avoid the trouble and remember 
that smaller (more negative) magnitudes mean brighter objects. Those who 
work at other wavelengths are less burdened by tradition and use less confusing
(but sometimes less convenient) units. Such units linearly relate to the
apparent brightness, F, or to the monochromatic brightness, $f_\lambda$ or $f_\nu$. In radio astronomy,
for example, one often encounters the jansky (1 Jy = $10^{-26}$ W m$^{-2}$ Hz$^{-1}$)
as a unit for $f_\nu$.

The relationship between apparent magnitude, m, and brightness, F, is: 

$$m = -2.5 \log_{10}(F) + K \qquad (1.4)$$

The constant K is often chosen so that modern measurements agree, more or
less, with the older catalogs, all the way back to Hipparchus. For example, the
bright star, Vega, has $m \approx 0$ in the modern magnitude system. If the flux in
Equation (1.4) is the total or bolometric flux (see Section 1.4) then the magnitude
defined is called the apparent bolometric magnitude, $m_{bol}$. Most practical
measurements are made in a restricted band-pass, but the modern
definition of such a band-pass magnitude remains as in Equation (1.4), even
to the extent that K is often chosen so that Vega has $m \approx 0$ in any band. This
standardization has many practical advantages, but is potentially confusing,
since the function $f_\lambda$ is not at all flat for Vega (see Figure 1.5) or any other star,






nor is the value of the integral in Equation (1.3) similar for different bands. In 
most practical systems, the constant K is specified by defining the values of m 
for some set of standard stars. Absolute calibration of such a system, so that 
magnitudes can be converted into energy units, requires comparison of at least 
one of the standard stars to a source of known brightness, like a blackbody at a 
stable temperature. 

Consideration of Equation (1.4) leads us to write an equation for the magnitude
difference between two sources as

$$\Delta m = m_1 - m_2 = -2.5 \log_{10}\left(\frac{F_1}{F_2}\right) \qquad (1.5)$$

This equation holds for both bolometric and band-pass magnitudes. It should 
be clear from Equation (1.5) that once you define the magnitudes of a set of 
standard stars, measuring the magnitude of an unknown is a matter of measuring 
a flux ratio between the standard and the unknown. Magnitudes are thus almost 
always measured in a differential fashion without conversion from detector 
response to absolute energy units. It is instructive to invert Equation (1.5), 
and write a formula for the flux ratio as a function of magnitude difference:

$$\frac{F_1}{F_2} = 10^{-0.4 \Delta m} = 10^{0.4 (m_2 - m_1)}$$

A word about notation: you can write magnitudes measured in a band-pass in 
two ways, by (1) using the band-pass name as a subscript to the letter “m”, or (2)
by using the name itself as the symbol. So for example, the B band apparent
magnitude of a certain star could be written as $m_B$ = 5.67 or as B = 5.67 and its
visual band magnitude written as $m_V$ = V = 4.56.
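The relation between magnitude differences and flux ratios (Equation 1.5 and its inverse) is easy to get backwards, so a small numerical sketch may help. The function names are illustrative, and the fluxes are in arbitrary units.

```python
import math


def magnitude_difference(flux_1, flux_2):
    """Delta m = m1 - m2 = -2.5 log10(F1 / F2), as in Equation (1.5)."""
    return -2.5 * math.log10(flux_1 / flux_2)


def flux_ratio(delta_m):
    """F1 / F2 = 10**(-0.4 * delta_m), the inverse of Equation (1.5)."""
    return 10.0 ** (-0.4 * delta_m)


print(flux_ratio(-5.0))                # a factor of 100: five magnitudes brighter
print(flux_ratio(-1.0))                # about 2.512, the flux ratio for one magnitude
print(magnitude_difference(1.0, 2.0))  # about +0.753: the fainter source has the larger magnitude
```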

1.6.2 Absolute magnitudes 

The magnitude system can also be used to express the luminosity of a source. 
The absolute magnitude of a source is defined to be the apparent magnitude 
(either bolometric or band-pass) that the source would have if it were at the 
standard distance of 10 parsecs in empty space (1 parsec = $3.086 \times 10^{16}$ meters;
see Chapter 3). The relation between the apparent and absolute magnitudes of
the same object is

$$m - M = 5 \log_{10}(r) - 5 \qquad (1.6)$$

where M is the absolute magnitude and r is the actual distance to the source in
parsecs. The quantity (m − M) on the left-hand side of Equation (1.6) depends
only on distance, and is called the distance modulus of the source. You should
recognize this equation as the equivalent of the inverse square law relation
between apparent brightness and luminosity (Equation 1.2). Equation (1.6) must
be modified if the source is not isotropic or if there is absorption along the path
between it and the observer.

To symbolize the absolute magnitude in a band-pass, use the band name as a 
subscript to the symbol M. The Sun, for example, has absolute magnitudes: 

$M_B = 5.48$
$M_V = 4.83$
$M_{bol} = 4.75$
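As a consistency check on these values, the distance modulus (Equation 1.6) can be used to recover the Sun’s apparent magnitude from its absolute magnitude. The sketch assumes no absorption and uses the standard conversion of about 206 265 astronomical units per parsec, which is not given in this section; the function names are illustrative.

```python
import math

AU_IN_PARSEC = 1.0 / 206265.0  # one astronomical unit in parsecs (approximate, assumed value)


def apparent_magnitude(absolute_mag, distance_pc):
    """m = M + 5 log10(r) - 5, with r in parsecs and no absorption (Equation 1.6)."""
    return absolute_mag + 5.0 * math.log10(distance_pc) - 5.0


def distance_from_modulus(apparent_mag, absolute_mag):
    """Invert the distance modulus: r = 10**((m - M + 5) / 5) parsecs."""
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)


m_sun_v = apparent_magnitude(4.83, AU_IN_PARSEC)
print(round(m_sun_v, 2))  # about -26.74, the Sun's apparent V magnitude seen from Earth

# At the standard distance of 10 pc the apparent and absolute magnitudes coincide.
print(apparent_magnitude(4.83, 10.0))
```

The result agrees with the figure of about −26 quoted earlier for the Sun.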



1.6.3 Measuring brightness or apparent magnitude 
from images 

We will consider in detail how to go about measuring the apparent magnitude of 
a source in Chapter 10. However, it is helpful to have a simplified
description of what is involved. Imagine a special detector that will take pictures 
of the sky through a telescope — a visible-light image of stars, similar to what 
you might obtain with a black-and-white digital camera. 

Our picture, shown in Figure 1.9, is composed of a grid of many little square 
elements called picture elements or pixels. Each pixel stores a number that is
proportional to the energy that reaches it during the exposure. Figure 1.9 dis- 
plays this data by mimicking a photographic negative: each pixel location is 
painted a shade of gray, with the pixels that store the largest numbers painted 
darkest. 

The image does not show the surfaces of the stars. Notice that the images of 
the stars are neither uniform nor hard-edged — they are most intense in the center 
and fade out over several pixels. Several effects cause this — the finite resolving 
power of any lens and scattering of photons by molecules and particles in the air, 
for example, or the scattering of photons from pixel to pixel within the detector 
itself. The diameter of a star image in the picture has to do with the strength of
the blurring effect and the choice of grayscale mapping, not with the physical
size of the star. Despite appearances, the size of each star image is actually the
same when scaled by peak brightness.

Fig. 1.9 A digital image. The width of each star image, despite appearances, is the same. Width is measured as the width at the brightness level equal to half the peak brightness. Bright stars have a higher peak and their half-power levels are at a higher brightness or gray level.

Suppose we manage to take a single picture with our camera that records an
image of the star Vega as well as that of some other star whose brightness we
wish to measure. To compute the brightness of a star, we add up the energies it
deposits in each pixel in its image. If $E_{xy}$ is the energy recorded in pixel x, y due
to light rays from the star, then the brightness of the star will be

$$F_{star} = \frac{1}{tA} \sum_{x,y} E_{xy}$$

where t is the exposure time in seconds and A is the area of the camera lens.
Measuring F is just a matter of adding up the $E_{xy}$ values.

Of course things are not quite so simple. A major problem stems from the fact
that the detector isn’t smart enough to distinguish between light rays coming
from the star and light rays coming from any other source in the same general
direction. A faint star or galaxy nearly in the same line of sight as the star, or a
moonlit terrestrial dust grain or air molecule floating in front of the star, can
make unwelcome additions to the signal. All such sources contribute background
light rays that reach the same pixels as the star’s light. In addition,
the detector itself may contribute a background that has nothing to do with
the sky. Therefore, the actual signal, $S_{xy}$, recorded by pixel x, y will be the
sum of the signal from the star, $E_{xy}$, and that from the background, $B_{xy}$, or

$$S_{xy} = E_{xy} + B_{xy}$$

The task then is to determine $B_{xy}$ so we can subtract it from each $S_{xy}$. You can
do this by measuring the energy reaching some pixels near but not within a star
image, and taking some appropriate average (call it B). Then assume that everywhere
in the star image, $B_{xy}$ is equal to B. Sometimes this is a good assumption,
sometimes not so good. Granting the assumption, then the brightness of the star is

$$F_{star} = \frac{1}{tA} \sum_{x,y} \left[ S_{xy} - B \right]_{star}$$

The apparent magnitude difference between the star and Vega is

$$m_{star} - m_{Vega} = -2.5 \log_{10}\left(\frac{F_{star}}{F_{Vega}}\right) = -2.5 \log_{10}\left(\frac{\sum_{x,y} \left[ S_{xy} - B \right]_{star}}{\sum_{x,y} \left[ S_{xy} - B \right]_{Vega}}\right) \qquad (1.7)$$

Notice that Equation (1.7) contains no reference to the exposure time, t, nor
to the area of the camera lens, A. Also notice that since a ratio is involved, the
pixels need not record units of energy; anything proportional to energy will do.
Furthermore, although we have used Vega as a standard star in this example, it
would be possible to solve Equation (1.7) for $m_{star}$ without using Vega, by
employing either of two different strategies:



Differential photometry. Replace Vega with any other star whose magnitude is 
known that happens to be in the same detector field as the unknown, then employ 
Equation (1.7). Atmospheric absorption effects should be nearly identical for 
both the standard and the unknown if they are in the same image. 

All-sky photometry. Take two images, one of the star to be measured, the second 
of a standard star, keeping conditions as similar as possible in the two expo- 
sures. Ground-based all-sky photometry can be difficult in the optical-NIR 
window, because you must look through different paths in the air to observe 
the standard and program stars. Any variability in the atmosphere (e.g. clouds) 
defeats the technique. In the radio window, clouds are less problematic and the 
all-sky technique is usually appropriate. In space, of course, there are no atmos- 
pheric effects, and all-sky photometry is usually appropriate at every wave- 
length. 
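The background subtraction and flux ratio behind Equation (1.7), used here in the differential mode, can be sketched on a synthetic image. Everything in this example is hypothetical: the Gaussian star profile, the aperture and annulus radii, the pixel counts, and the catalog magnitude of the standard star are invented for illustration, and the helper names `make_image` and `net_counts` are not from any library. Chapter 10 treats real measurements.

```python
import math


def make_image(size, background, stars):
    """Build a synthetic image: constant background plus Gaussian star images.

    stars is a list of (x0, y0, total_counts); every star gets the same width.
    """
    sigma = 1.5  # assumed blur width in pixels, identical for all stars
    norm = 2.0 * math.pi * sigma ** 2
    image = [[background for _ in range(size)] for _ in range(size)]
    for x0, y0, total in stars:
        for y in range(size):
            for x in range(size):
                r2 = (x - x0) ** 2 + (y - y0) ** 2
                image[y][x] += total * math.exp(-r2 / (2.0 * sigma ** 2)) / norm
    return image


def net_counts(image, x0, y0, r_star=6.0, r_in=9.0, r_out=12.0):
    """Sum (S_xy - B) inside r_star, with B the mean of an annulus r_in..r_out."""
    star_pixels, sky_pixels = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(x - x0, y - y0)
            if r <= r_star:
                star_pixels.append(value)
            elif r_in <= r <= r_out:
                sky_pixels.append(value)
    background = sum(sky_pixels) / len(sky_pixels)
    return sum(value - background for value in star_pixels)


# Hypothetical frame: a standard star of known magnitude and an unknown star
# that is 10 times fainter, both on a uniform background of 50 counts per pixel.
image = make_image(64, 50.0, [(16, 16, 10000.0), (45, 40, 1000.0)])
m_standard = 5.00  # assumed catalog magnitude of the standard star

ratio = net_counts(image, 45, 40) / net_counts(image, 16, 16)
m_unknown = m_standard - 2.5 * math.log10(ratio)
print(round(m_unknown, 2))  # about 7.5, i.e. 2.5 magnitudes fainter than the standard
```

Neither the exposure time nor the lens area appears anywhere in the calculation, and the pixel values are in arbitrary counts, which is the point made above about Equation (1.7).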



Summary 

• Astronomers can gather information about objects outside the Earth from multi- 
ple channels: neutrinos, meteorites, cosmic rays, and gravitational waves; how- 
ever, the vast majority of information arrives in the form of electromagnetic 
radiation. 

• The wave theory formulated by Maxwell envisions light as a self-propagating
transverse disturbance in the electric and magnetic fields.

• A light wave is characterized by wavelength or frequency, $\lambda = c/\nu$, and exhibits the
properties of diffraction and interference.

• Quantum mechanics improves on the wave theory, and describes light as a stream 
of photons, massless particles that can exhibit wave properties. 

• A simple but useful description of light postulates a set of geometric rays that carry 
luminous energy, from a source of a particular luminosity to an observer who 
locally measures an apparent brightness, flux, or irradiance. 

• The spectrum of an object gives its brightness as a function of wavelength or 
frequency. 

• The Kirchhoff-Bunsen rules specify the circumstances under which an object
produces an emission line, absorption line, or continuous spectrum. 

• Line spectra contain information about (among other things) chemical composi- 
tion. However, although based on patterns of absorption lines, the spectral types of 
stars depend primarily on stellar temperatures. 

• A blackbody, described by Planck’s law, Wien’s law, and the Stefan-Boltzmann
law, emits a continuous spectrum whose shape depends only on its
temperature. 

• The astronomical magnitude system uses apparent and absolute magnitudes to 
quantify brightness measurements on a logarithmic scale. 


• Photometry and spectroscopy are basic astronomical measurements. They often 
depend on direct comparison to standard objects, and the use of standard band- 
passes. 

• Important constants and formulae:

1 eV = $1.602 \times 10^{-19}$ J

h = Planck’s constant = $6.626 \times 10^{-34}$ J s

1 Jansky = $10^{-26}$ W m$^{-2}$ Hz$^{-1}$

$L_\odot$ = $3.87 \times 10^{26}$ W

Solar constant (flux at the top of the atmosphere) = 1370 W m$^{-2}$

$\sigma$ = Stefan-Boltzmann constant = $5.6696 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$

k = Boltzmann constant = $1.3806 \times 10^{-23}$ J K$^{-1}$ = $8.6174 \times 10^{-5}$ eV K$^{-1}$

Energy of a photon: $E = h\nu$

Inverse square law of light: $F = \frac{L}{4\pi r^2}$

Stefan-Boltzmann law: $s = \sigma T^4$

Wien’s law: $T \lambda_{\mathrm{MAX}} = 2.8979 \times 10^{-3}$ m K

Magnitude difference and flux:

$$\Delta m = m_1 - m_2 = -2.5 \log_{10}\left(\frac{F_1}{F_2}\right)$$

$$\frac{F_1}{F_2} = 10^{-0.4 \Delta m} = 10^{0.4 (m_2 - m_1)}$$

Distance modulus:

$$m - M = 5 \log_{10}(r) - 5$$




Exercises 

‘Therefore,’ I said, ‘by the use of problems, as in geometry, we shall also pursue 
astronomy, ... by really taking part in astronomy we are going to convert the 
prudence by nature in the soul from uselessness to usefulness.’ 

— Socrates, in Plato, The Republic, Book VII, 530b (c. 360 BC) 

1. Propose a definition of astronomy that distinguishes it from other sciences like 
physics and geology. 

2. What wavelength photon would you need to: 

(a) ionize a hydrogen atom (ionization energy = 13.6 eV) 

(b) dissociate the molecular bond in a diatomic hydrogen molecule (dissociation 
energy = 4.48 eV) 

(c) dissociate a carbon monoxide molecule (dissociation energy = 11.02 eV) 







3. What are the units of the monochromatic brightness, \(f_\lambda\)? 

4. What is the value of the ratio \(f_\lambda / f_\nu\) for any source? 

5. (a) Define A(T) to be the average slope of the Planck function, \(B(\lambda, T)\), between 
400 nm (violet) and 600 nm (yellow). In other words: 

\[ A(T) = [B(600, T) - B(400, T)] / 200\ \mathrm{nm} \] 

Use a spreadsheet to compute and plot A(T), over the temperature range 2000 K 
to 30 000 K (typical of most stars). What happens to A at very large and very 
small temperatures? 

(b) Compute and plot the function 

\[ C(T) = \log[B(400, T)] - \log[B(600, T)] \] 

over the same range of temperature. Again, what happens to C at the extremes of 
temperature? 

(c) Comment on the usefulness of A and C as indicators of the temperature of a 
blackbody. 

6. A certain radio source has a monochromatic flux density of 1 Jy at a frequency of 
1 MHz. What is the corresponding flux density in photon number? (How many 
photons arrive per m 2 in one second with frequencies between 1 000 000 Hz and 
1 000 001 Hz?) 

7. The bolometric flux from a star with \(m_{\mathrm{bol}} = 0\) is about \(2.65 \times 10^{-8}\ \mathrm{W\,m^{-2}}\) outside 
the Earth’s atmosphere. 

(a) Compute the value of the constant K in Equation (1.4) for bolometric 
magnitudes. 

(b) Compute the bolometric magnitude of an object with a total flux of: 

(i) one solar constant 

(ii) \(1.0\ \mathrm{W\,m^{-2}}\) 

8. The monochromatic flux at the center of the B band-pass (440 nm) for a certain star 
is 375 Jy. If this star has a blue magnitude of \(m_B = 4.71\), what is the monochromatic 
flux, in Jy, at 440 nm for: 

(a) a star with \(m_B = 8.33\) 

(b) a star with \(m_B = -0.32\) 

9. A double star has two components of equal brightness, each with a magnitude of 
8.34. If these stars are so close together that they appear to be one object, what is the 
apparent magnitude of the combined object? 

10. A gaseous nebula has an average surface brightness of 17.77 magnitudes per square 
second of arc. 

(a) If the nebula has an angular area of 145 square arcsec, what is its total apparent 
magnitude? 

(b) If the nebula were moved to twice its original distance, what would happen to its 
angular area, total apparent magnitude, and surface brightness? 

11. At maximum light, Type Ia supernovae are believed to have an absolute visual 
magnitude of -19.60. A supernova in the Pigpen Galaxy is observed to reach 
apparent visual magnitude 13.25 at its brightest. Compute the distance to the Pigpen 
Galaxy. 

12. Derive the distance modulus relation in Equation (1.6) from the inverse square law 
relation in Equation (1.1). 

13. Show that, for small values of \(\Delta m\), the difference in magnitude is approximately 
equal to the fractional difference in brightness, that is 

\[ \Delta m \approx \frac{\Delta F}{F} \] 

Hint: consider the derivative of m with respect to F. 

14. An astronomer is performing synthetic aperture photometry on a single unknown star 
and standard star (review Section 1.6.3) in the same field. The data frame is in the 
figure below. The unknown star is the fainter one. If the magnitude of the standard is 
9.000, compute the magnitude of the unknown. 

Actual data numbers are listed for the frame in the table. Assume these are proportional 
to the number of photons counted in each pixel, and that the band-pass is 
narrow enough that all photons can be assumed to have the same energy. Remember 
that photometrically both star images have the same size. 





  34   16   26   33   37   22   25   25   29   19   28   25 
  22   20   44   34   22   26   14   30   30   20   19   17 
  31   70   98   66   37   25   35   36   39   39   23   20 
  34   99  229  107   38   28   46  102  159   93   37   22 
  33   67  103   67   36   32   69  240  393  248   69   30 
  22   33   34   29   36   24   65  241  363  244   68   24 
  28   22   17   16   32   24   46   85  157   84   42   22 
  18   25   27   26   17   18   30   29   35   24   30   27 
  32   23   16   29   25   24   30   28   20   35   22   23 
  28   28   28   24   26   26   17   19   30   35   30   26 




Chapter 2 

Uncertainty 



Errare humanum est. 

— Anonymous Latin saying 

Upon foundations of evidence, astronomers erect splendid narratives about the 
lives of stars, the anatomy of galaxies or the evolution of the Universe. Inaccu- 
rate or imprecise evidence weakens the foundation and imperils the astronom- 
ical story it supports. Wrong ideas and theories are vital to science, which 
normally works by proving many, many ideas to be incorrect until only one 
remains. Wrong data, on the other hand, are deadly. 

As an astronomer you need to know how far to trust the data you have, or how 
much observing you need to do to achieve a particular level of trust. This 
chapter describes the formal distinction between accuracy and precision in 
measurement, and methods for estimating both. It then introduces the concepts 
of a population, a sample of a population, and the statistical descriptions of each. 
Any characteristic of a population (e.g. the masses of stars) can be described by 
a probability distribution (e.g. low-mass stars are more probable than high-mass 
stars) so we next will consider a few probability distributions important in 
astronomical measurements. Finally, armed with new statistical expertise, we 
revisit the question of estimating uncertainty, both in the case of an individual 
measurement, as well as the case in which multiple measurements combine to 
produce a single result. 

2.1 Accuracy and precision 

In common speech, we often do not distinguish between these two terms, but we 
will see that it is very useful to attach very different meanings to them. An 
example will help. 

2.1.1 An example 

In the distant future, a very smart theoretical astrophysicist determines that the 
star Malificus might soon implode to form a black hole, and in the process 
destroy all life on its two inhabited planets. Careful computations show that if 
Malificus is fainter than magnitude 14.190 by July 24, as seen from the observ- 
ing station orbiting Pluto, then the implosion will not take place and its planets 
will be spared. The Galactic government is prepared to spend the ten thousand 
trillion dollars necessary to evacuate the doomed populations, but needs to know 
if the effort is really called for; it funds some astronomical research. Four 
astronomers and a demigod each set up experiments on the Pluto station to 
measure the apparent magnitude of Malificus. 

The demigod performs photometry with divine perfection, obtaining a result 
of 14.123 010 (all the remaining digits are zeros). The truth, therefore, is that 
Malificus is brighter than the limit and will implode. The four astronomers, in 
contrast, are only human, and, fearing error, repeat their measurements — five 
times each. I’ll refer to a single one of these five as a “trial.” Table 2.1 lists the 
results of each trial, and Figure 2.1 illustrates them. 

2.1.2 Accuracy and systematic error 

In our example, we are fortunate a demigod participates, so we feel perfectly 
confident to tell the government, sorry, Malificus is doomed, and those addi- 
tional taxes are necessary. The accuracy of a measurement describes (usually 
numerically) how close it is to the “true” value. The demigod measures with 
perfect accuracy. 



Table 2.1. Results of trials by four astronomers. The values for \(\sigma\) and s are 
computed from Equations (2.3) and (2.1), respectively 

Astronomer                 A           B            C            D 
------------------------------------------------------------------------- 
Trial 1                    14.115      14.495       14.386       14.2 
Trial 2                    14.073      14.559       14.322       14.2 
Trial 3                    14.137      14.566       14.187       14.2 
Trial 4                    14.161      14.537       14.085       14.2 
Trial 5                    14.109      14.503       13.970       14.2 
Mean                       14.119      14.532       14.190       14.2 
Deviation from truth       -0.004      +0.409       +0.067       +0.077 
Spread                     0.088       0.071        0.418        0 
\(\sigma\)                 0.033       0.032        0.174        0 
s                          0.029       0.029        0.156        0 
Uncertainty of the mean    0.013       0.013        0.070        (0.05) 
Interpretation             Evacuate    Stay         Uncertain    Uncertain 
Accuracy?                  Accurate    Inaccurate   Accurate     Inaccurate 
Precision?                 Precise     Precise      Imprecise    Imprecise 


Fig. 2.1 Apparent magnitude measurements, plotted as trial number (1 to 5) against 
magnitude for Astronomers A, B, C, and D. The arrow points to the demigod's result, 
which is the true value. The dotted line marks the critical limit. 



What is the accuracy of the human results? First, decide what we mean by “a 
result”: since each astronomer made five trials, we choose a single value that 
summarizes these five measurements. In this example, each astronomer chooses 
to compute the mean — or average — of the five, a reasonable choice. (We will see 
there are others.) Table 2.1 lists the mean values from each astronomer — 
“results” that summarize the five trials each has made. 

Since we know how much each result deviates from the truth, we could 
express its accuracy with a sentence like: “Astronomer B’s result is relatively 
inaccurate,” or, more specifically: “The result of Astronomer A is 0.004 mag- 
nitude smaller than the true value.” Statements of this kind are easy to make 
because the demigod tells us the truth, but in the real Universe, how could you 
determine the “true” value, and hence the accuracy? In science, after all, the 
whole point is to discover values that are unknown at the start, and no demigods 
work at observatories. 

How, then, can we judge accuracy? The alternative to divinity is variety. We 
can only repeat measurements using different devices, assumptions, strategies, 
and observers, and then check for general agreements (and disagreements) 
among the results. We suspect a particular set-up of inaccuracy if it disagrees 
with all other experiments. For example, the result of Astronomer B differs 
appreciably from those of his colleagues. Even in the absence of the demigod 
result, we would guess that B’s result is the least accurate. In the absence of the 
demigod, in the real world, the best we can hope for is a good estimate of 
accuracy. 

If a particular set-up always produces consistent (estimated) inaccuracies, if 
its result is always biased by about the same amount, then we say it produces a 
systematic error. Although Astronomer B’s trials do not have identical out- 
comes, they all tend to be much too large, and are thus subject to a systematic 
error of around +0.4 magnitude. Systematic errors are due to some instrumental 
or procedural fault, or some mistake in modeling the phenomena under inves- 
tigation. Astronomer B, for example, used the wrong magnitude for the standard 
star in his measurements. He could not improve his measurement just by 



38 



Uncertainty 



repeating it — making more trials would give the same general result, and B 
would continue to recommend against evacuation. 

In a second example of inaccuracy, suppose the astrophysicist who computed 
the critical value of \(V = m_V = 14.19\) had made a mistake because he neglected 
the effect of the spin of Malificus. Then even perfectly accurate measurements 
of brightness would result in false (inaccurate) predictions about the collapse of 
the star. 

2.1.3 Precision and random error 

Precision differs from accuracy. The precision of a measurement describes how 
well or with what certainty a particular result is known, without regard to its 
truth. Precision denotes the ability to be very specific about the exact value of 
the measurement itself. A large number of legitimately significant digits in the 
numerical value, for example, indicates high precision. Because of the possi- 
bility of systematic error, of course, high precision does not mean high accuracy. 

Poor precision does imply a great likelihood of poor accuracy. You might 
console yourself with the possibility that an imprecise result could be accurate, 
but the Universe seldom rewards that sort of optimism. You would do better to 
regard precision as setting a limit on the accuracy expected. Do not expect 
accuracy better than your precision, and do not be shocked when, because of 
systematic error, it is a lot worse. 

Unlike accuracy, precision is often easy to quantify without divine assis- 
tance. You can determine the precision numerically by examining the degree 
to which multiple trials agree with one another. If the outcome of one trial 
differs from the outcome of the next in an unpredictable fashion, the scattering 
is said to arise from stochastic, accidental, or random error. (If the outcome of 
one trial differs from the next in a predictable fashion, then one should look for 
some kind of systematic error or some unrecognized physical effect.) The term 
random “error” is unfortunate, since it suggests some sort of mistake or failure, 
whereas you should really think of it as a scattering of values due to the uncer- 
tainty inherent in the measuring process itself. Random error limits precision, 
and therefore limits accuracy. 

To quantify random error, you could examine the spread in values for a finite 
number of trials: 

spread = largest trial result - smallest trial result 

The spread will tend to be larger for experiments with the largest random 
error and lowest precision. A better description of the scatter or “dispersion” of 
a set of N trials, \(\{x_1, x_2, \ldots, x_N\}\), would depend on all N values. One useful 
statistic of this sort is the estimated standard deviation, s: 

\[ s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2} \qquad (2.1) \] 

where \(\bar{x}\) is the mean of the N trials. 




We examine Equation (2.1) more carefully in later sections of this chapter. 
The values for s and for the spread in our example are in Table 2.1. These 
confirm the subjective impression from Figure 2.1: in relative terms, the results 
of the astronomers are as follows: 

A is precise and accurate; 

B is precise but inaccurate; 

C is imprecise and accurate (to the degree expected from the precision); 

D is a special imprecise case, discussed below. 
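
The spread and the standard deviation can be recomputed directly from the trial values. The sketch below is an illustration rather than the author's procedure: the function names and the `ddof` argument are assumptions, and only the trials of Astronomers A and B are copied from Table 2.1. With (N - 1) weighting the scatter for Astronomer A comes out near 0.033, and with N weighting near 0.029; both numbers appear in Table 2.1.

```python
import math

# Trial magnitudes for Astronomers A and B, copied from Table 2.1.
trials = {
    "A": [14.115, 14.073, 14.137, 14.161, 14.109],
    "B": [14.495, 14.559, 14.566, 14.537, 14.503],
}


def mean(values):
    """Arithmetic mean of a list of trial results."""
    return sum(values) / len(values)


def spread(values):
    """Largest trial result minus smallest trial result."""
    return max(values) - min(values)


def std_dev(values, ddof=1):
    """Root-mean-square scatter about the mean.

    ddof=1 gives (N - 1) weighting; ddof=0 gives N weighting.
    """
    m = mean(values)
    sum_sq = sum((x - m) ** 2 for x in values)
    return math.sqrt(sum_sq / (len(values) - ddof))


for name, values in trials.items():
    print(name,
          round(mean(values), 3),
          round(spread(values), 3),
          round(std_dev(values, ddof=1), 3),
          round(std_dev(values, ddof=0), 3))
```

Unlike the spread, which uses only the two extreme trials, the standard deviation depends on all five values.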

The basic statistical techniques for coping with random error and estimating 
the resulting uncertainty are the subjects of this chapter. A large volume of 
literature deals with more advanced topics in the statistical treatment of data 
dominated by stochastic error — a good introduction is the book by Bevington 
(1969). Although most techniques apply only to stochastic error, in reality, 
systematic error is usually the more serious limitation to good astronomy. 

Techniques for detecting and coping with systematic error are varied and 
indirect, and therefore difficult to discuss at an elementary level. Sometimes, 
one is aware of systematic error only after reconciling different methods for 
determining the same parameter. This is the case with Astronomer B, whose 
result differs from the others by more than the measured stochastic error. Some- 
times, what appears to be stochastic variation turns out to be a systematic effect. 
This might be the case with astronomer C, whose trial values decrease with time, 
suggesting perhaps some change in the instrument or environment. Although it 
is difficult to recognize systematic error, the fact that it is the consequence of 
some sort of mistake means that it is often possible to correct the mistake and 
improve accuracy. 

Stochastic error and systematic error both contribute to the uncertainty 
of a particular result. That result is useless until the size of its uncertainty is known. 



2.1.4 Uncertainty 

. . . one discovers over the years that as a rule of thumb accidental errors are twice 
as large as observers indicated, and systematic errors may be five times larger 
than indicated. 

— C. Jaschek, Error, Bias and Uncertainties in Astronomy (1990) 

In our Malificus example, the recommendation that each astronomer makes 
depends on two things — the numerical value of the result and the uncertainty 
the astronomer attaches to its accuracy. Astronomer A recommends evacuation 
because (a) her result is below the cut-off by 0.07 magnitude, and (b) the 
uncertainty she feels is small because her random error, as measured by s, is 
small compared to 0.07 and because she assumes her systematic error is also 
small. The assumption of a small systematic error is based mostly on A’s 
confidence that she “knows what she is doing” and hasn’t made a mistake. 
Later, when she is aware that both C and D agree with her result, she can be even 
more sanguine about this. Astronomical literature sometimes makes a distinc- 
tion between internal error, which is the uncertainty computed from the scatter 
of trials, and external error, which is the total uncertainty, including systematic 
effects. 

Astronomer A should quote a numerical value for her uncertainty. If u is the 
uncertainty of a result, r, then the probability that the true value is between \(r + u\) 
and \(r - u\) is 1/2. 

Statistical theory (see below) says that under certain broad conditions, the 
uncertainty of the mean of N values is something like \(s/\sqrt{N}\). Thus, the 
uncertainty imposed by random error (the internal error) alone for A is about 0.013. 
The additional uncertainty due to systematic error is harder to quantify. The 
astronomer should consider such things as the accuracy of the standard star 
magnitudes and the stability of her photometer. In the end, she might feel that 
her result is uncertain (external error) by 0.03 magnitudes. She concludes the 
chances are much greater than 50% that the limit is passed, and thus must 
recommend evacuation in good conscience. 
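
As a check on the arithmetic, the internal error follows from the value of s tabulated for each astronomer in Table 2.1 and the N = 5 trials each performed. The label \(u_{\mathrm{int}}\) is introduced here only for convenience.

```latex
u_{\mathrm{int}}(\mathrm{A}) = \frac{s}{\sqrt{N}} = \frac{0.029}{\sqrt{5}} \approx 0.013
\qquad
u_{\mathrm{int}}(\mathrm{C}) = \frac{0.156}{\sqrt{5}} \approx 0.070
```

Astronomer A's result lies 0.07 magnitude below the limit, several times her internal error, while Astronomer C's internal error alone is comparable to that margin.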

Astronomer B goes through the same analysis as A, and recommends against 
evacuation with even greater (why?) conviction. Since quadrillions of dollars 
and billions of lives are at stake, it would be criminal for A and B not to confront 
their disagreement. They must compare methods and assumptions and try to 
determine which (if either) of them has the accurate result. 

Astronomer C shouldn’t make a recommendation because his uncertainty is 
so large. He can’t rule out the need for an evacuation, nor can he say that one is 
necessary. We might think C’s measurements are so imprecise that they are 
useless, but this is not so. Astronomer C’s precision is sufficient to cast doubt on 
B’s result (but not good enough to confirm A’s). The astronomers thus should 
first concentrate on B’s experimental method in their search for the source of 
their disagreement. Astronomer C should also be suspicious of his relatively 
large random error compared to the others. This may represent the genuine 
accidental errors that limit his particular method, or it may result from a system- 
atic effect that he could correct. 

2.1.5 Digitizing effects 

What about Astronomer D, who performed five trials that gave identical results? 
Astronomer D made her measurements with a digital light meter that only reads 
to the nearest 0.2 magnitude, and this digitization is responsible for her very 
uniform data. 

From the above discussion, it might seem that since her scatter is zero, then 
D’s measurement is perfectly precise. This is misguided, because it ignores what 
D knows about her precision: rounding off every measurement produces 
uncertainty. She reasons that in the absence of random errors, there is a nearly 
100% chance that the true value is within ±0.1 of her measurement, and there is 
a nearly 50% chance that the true value lies within 0.05 magnitudes of her 
measurement. Thus, D would report an uncertainty of around ±0.05 magnitude. 
This is a case where a known systematic error (digitization) limits precision, and 
where stochastic error is so small compared to the systematic effect that it 
cannot be investigated at all. 

You can control digitization effects, usually by building a more expensive 
instrument. If you can arrange for the digitization effects to be smaller than the 
expected random error, then your measurements will exhibit a stochastic scatter 
and you can ignore digitization when estimating your precision. 
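
Astronomer D's reasoning about her 0.2 magnitude meter can be illustrated with a short sketch. It is a simplified model under stated assumptions: the `quantize` function is hypothetical, random error is ignored, and the true values are taken to be spread evenly across the one digitization bin centered on 14.2.

```python
def quantize(value, step=0.2):
    """Round value to the nearest multiple of step, as a coarse digital meter would."""
    return round(value / step) * step


# The demigod's true value reads as 14.2 on a meter with 0.2 magnitude steps.
reading_of_truth = quantize(14.123)

# Hypothetical true magnitudes spread evenly across the bin from 14.1 to 14.3.
n = 2000
true_values = [14.1 + 0.2 * (i + 0.5) / n for i in range(n)]
errors = [abs(quantize(v) - v) for v in true_values]

# Fraction of true values lying within 0.05 and within 0.10 of the reading.
frac_within_005 = sum(e <= 0.05 for e in errors) / n
frac_within_010 = sum(e <= 0.10 for e in errors) / n

print(reading_of_truth, frac_within_005, frac_within_010)
```

Half of the evenly spread true values fall within 0.05 magnitude of the reading and all of them fall within 0.1, which is the basis for quoting an uncertainty of about 0.05 magnitude.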

2.1.6 Significant digits 

One way to indicate the uncertainty in a measurement is to retain only those 
digits that are warranted by the uncertainty, with the remaining insignificant 
digits rounded off. In general, only one digit with “considerable” uncertainty 
(more than ±1) should be retained. For example, Astronomer C had measured a 
value of 14.194 with an uncertainty of at least \(0.156/\sqrt{5} = 0.070\). Astronomer 
C would realize that the last digit “4” has no significance whatever, the digit 
“1” is uncertain by almost ±1, so the digit “9”, which has considerable uncertainty, 
is the last that should be retained. Astronomer C should quote his result as 
14.19. 

Astronomer A, with the result 14.119, should likewise recognize that her 
digit “1” in the hundredths place is uncertain by more than ±1, so she should 
round off her result to 14.12. 

It is also very good practice to quote the actual uncertainty. Usually one or 
two digits in the estimate of uncertainty are all that are significant. The first three 
astronomers might publish (internal errors): 

A’s result: 14.12 ± 0.013 
B’s result: 14.53 ± 0.013 
C’s result: 14.19 ± 0.07 

Note that Astronomers A and C retain the same number of significant digits, 
even though A’s result is much more certain than C’s. Astronomer B, who is 
unaware of his large systematic error, estimates his (internal) uncertainty in 
good faith, but nevertheless illustrates Jaschek’s rule of thumb about under- 
estimates of systematic error. 
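
The rounding rule used for Astronomers A and C can be sketched in code. The helper below is hypothetical and encodes one common convention, assumed here rather than taken from the text: keep digits down to the decimal place of the leading digit of the uncertainty. It reproduces the two worked cases above.

```python
import math


def round_to_uncertainty(value, uncertainty):
    """Round value to the decimal place of the leading digit of its uncertainty.

    This is an assumed convention for illustration, not a universal rule.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # The leading digit of 0.070 is in the hundredths place, so keep 2 decimals.
    decimals = -math.floor(math.log10(uncertainty))
    return round(value, decimals)


print(round_to_uncertainty(14.194, 0.070))  # Astronomer C
print(round_to_uncertainty(14.119, 0.013))  # Astronomer A
```

A rule like this cannot be applied to Astronomer D, whose measured scatter is zero; her uncertainty has to come from what she knows about her instrument.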

2.2 Populations and samples 

As some day it may happen that a victim must be found, 

I've got a little list — I’ve got a little list 



— W.S. Gilbert, The Mikado, Act I, 1885 






In the Malificus problem, our fictional astronomers used simple statistical com- 
putations to estimate both brightness and precision. We now treat more system- 
atically the statistical analysis of observational data of all kinds, and begin with 
the concept of a population. 

Consider the problem of determining a parameter (e.g. the brightness of a 
star) by making several measurements under nearly identical circumstances. We 
define the population under investigation as the hypothetical set of all possible 
measurements that could be made with an experiment substantially identical to 
our own. We then imagine that we make our actual measurements by drawing a 
finite sample (five trials, say) from this much larger population. Some popula- 
tions are indefinitely large, or are so large that taking a sample is the only 
practical method for investigating the population. Some populations are finite 
in size, and sometimes are small enough to be sampled completely. Table 2.2 
gives some examples of populations and samples. 
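
Drawing a sample from a finite population is simple to simulate. The sketch below uses the marble population listed in Table 2.2; the fixed seed is an arbitrary assumption made only so that a run can be repeated, and the variable names are illustrative.

```python
import random
from collections import Counter

# The marble population of Table 2.2: 500 red, 499 blue, 1 purple.
population = ["red"] * 500 + ["blue"] * 499 + ["purple"]

rng = random.Random(0)  # arbitrary fixed seed, for repeatability only

small_sample = rng.sample(population, 5)     # 5 marbles drawn without replacement
better_sample = rng.sample(population, 50)   # 50 marbles drawn without replacement

print(Counter(small_sample))
print(Counter(better_sample))
```

Neither sample is guaranteed to reveal the purple marble: the chance that a sample of n marbles contains it is n/1000, or 0.5% for 5 marbles and 5% for 50. A larger sample represents the population better, but rare members can still be missed.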



Table 2.2. Populations and samples. Samples can be more or less 
representative of the population from which they are drawn 

Population                      Sample                       Better sample 
------------------------------------------------------------------------------------- 
1000 colored marbles mixed      5 marbles drawn at random    50 marbles drawn at 
in a container: 500 red,        from the container           random 
499 blue, 1 purple 

The luminosities of each        The luminosities of each     The luminosities of 100 
star in the Milky Way           of the nearest 100 stars     stars at random locations 
galaxy (about 10^11 values)     (100 values)                 in the galaxy (100 values) 

The weights of every            The weights of each          The weights of 100 people 
person on Earth                 person in this room          drawn from random 
                                                             locations on Earth 

The outcomes of all             The outcome of 1 such        The outcomes of 100 such 
possible experiments in         experiment                   experiments 
which one counts the 
number of photons that 
arrive at your detector 
during one second from 
the star Malificus 





2.2.1 Descriptive statistics of a finite population 

But all evolutionary biologists know that variation itself is nature’s only 
irreducible essence. Variation is the hard reality, not a set of imperfect measures 
for a central tendency 

— Stephen Jay Gould, The Median Isn’t the Message, Bully for Brontosaurus, 1991 

Imagine a small, finite population, which is a set of M values or members, 
\(\{x_1, x_2, \ldots, x_M\}\). The list of salaries of the employees in a small business, like 
the ones in Table 2.3, would be an example. We can define some statistics that 
summarize or describe the population as a whole. 

Measures of the central value. If every value in the population is known, a 
familiar descriptive statistic, the population mean, is just 

\[ \mu = \frac{1}{M}\sum_{i=1}^{M} x_i \] 

Two additional statistics also measure the central or representative value of 
the population. The median, or midpoint, is the value that divides the population 
exactly in half: just as many members have values above as have values below 
the median. If n(E) is the number of members of a population with a particular 
characteristic, E, then the median, \(\mu_{1/2}\), satisfies 

\[ n(x_i < \mu_{1/2}) = n(x_i > \mu_{1/2}) \approx \frac{M}{2} \] 

Compared to the mean, the median is a bit more difficult to compute if M is 
large, since you have to sort the list of values. In the pre-sorted list in Table 2.3, 
we can see by inspection that the median salary is $30,000, quite a bit different 
from the mean (about $294,000). The third statistic is the mode, which is the most 
common or most frequent value. In the example, the mode is clearly $15,000, 
the salary of the four astronomers. In a sample in which there are no identical 
values, you can still compute the mode by sorting the values into bins, and then 
searching for the bin with the most members. Symbolically, if \(\mu_{\max}\) is the mode, 
then 

\[ n(x_i = \mu_{\max}) > n(x_i = y), \quad y \neq \mu_{\max} \] 
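
The three measures of central value can be computed for the salary list of Table 2.3 with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative, and salaries are in thousands of dollars as in the table.

```python
import statistics

# Salaries in thousands of dollars, one entry per employee (Table 2.3):
# 1 president, 1 vice president, 3 programmers, 4 astronomers.
salaries = [2000] + [500] + [30] * 3 + [15] * 4

mean_salary = statistics.mean(salaries)      # sum of the values divided by M
median_salary = statistics.median(salaries)  # middle value of the sorted list
mode_salary = statistics.mode(salaries)      # most frequent value

print(mean_salary, median_salary, mode_salary)
```

The mean comes out near 294 thousand dollars, far above the median of 30 and the mode of 15, because the two largest salaries dominate the sum.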



Table 2.3. Employee salaries at Astroploit.com 

Job title (number of employees)    Salary in thousands of dollars 
----------------------------------------------------------------- 
President (1)                      2000 
Vice president (1)                 500 
Programmer (3)                     30 
Astronomer (4)                     15 



Fig. 2.2 Histogram of the data in Table 2.4. The horizontal axis is speed 
(\(\mathrm{km\,s^{-1}}\)). 



Which measure of the central value is the “correct” one? The mean, median, 
and mode all legitimately produce a central value. Which one is most relevant 
depends on the question being asked. In the example in Table 2.3, if you were 
interested in balancing the corporate accounts, then the mean would be most useful. 
If you were interested in organizing a workers’ union, the mode might be more 
interesting. 

Measures of dispersion. How scattered are the members of a population? 
Are values clustered tightly around the central value, or are many members 
significantly different from one another? Table 2.4 gives the speeds of stars 
in the direction perpendicular to the Galactic plane. Two populations differ in 
their chemical compositions: one set of 25 stars (Group A) contains the nearby 
solar-type stars that most closely match the Sun in iron abundance. Group A is 
relatively iron-rich. A second group, B, contains the 25 nearby solar-type stars 
that have the lowest known abundances of iron in their atmospheres. Figure 2.2 
summarizes the table with a histogram. Clearly, the central value of the speed is 
different for the two populations. Group B stars, on average, zoom through the 
plane at a higher speed than do members of Group A. 

A second difference between these populations — how spread out they are — 
concerns us here. The individual values in Group B are more dispersed than 
those in Group A. Figure 2.2 illustrates this difference decently, but we want a 
compact and quantitative expression for it. To compute such a statistic, we first 
examine the deviation of each member from the population mean: 

deviation from the mean = \((x_i - \mu)\) 

Those values of \(x_i\) that differ most from \(\mu\) will have the largest deviations. The 
definition of \(\mu\) ensures that the average of all the deviations will be zero (positive 
deviations will exactly balance negative deviations), so the average deviation is 
an uninteresting statistic. The average of all the squares of the deviations, in 
contrast, must be a positive number. This is called the population variance: 



\[ \sigma^2 = \frac{1}{M}\sum_{i=1}^{M}(x_i - \mu)^2 = \frac{1}{M}\sum_{i=1}^{M} x_i^2 - \mu^2 \qquad (2.2) \] 

Table 2.4. Speeds perpendicular to the Galactic plane, in \(\mathrm{km\,s^{-1}}\), for 50 nearby 
solar-type stars 

Group A: 25 iron-rich stars          Group B: 25 iron-poor stars 
----------------------------------------------------------------- 
 0.5   7.1   9.2  14.6  18.8          0.3   7.9  16.8  35.9  48.3 
 1.1   7.5  10.7  15.2  19.6          0.4  10.0  18.1  38.8  55.5 
 5.5   7.8  12.0  16.1  24.2          2.5  10.8  23.1  42.2  61.2 
 5.6   7.9  14.3  17.1  26.6          4.2  14.5  26.0  42.3  67.2 
 6.9   8.1  14.5  18.0  32.3          6.1  15.5  32.1  46.6  76.6 



The variance tracks the dispersion nicely — the more spread out the popula- 
tion members are, the larger the variance. Because the deviations enter Equation 
(2.2) as quadratics, the variance is especially sensitive to population members 
with large deviations from the mean. 

The square root of the population variance is called the standard deviation 
of the population: 

\[ \sigma = \sqrt{\frac{1}{M}\sum_{i=1}^{M}(x_i - \mu)^2} \qquad (2.3) \] 

\(\sigma\) has the same dimensions as the population values themselves. For example, 
the variance of Group A in Table 2.4 is 57.25 \(\mathrm{km^2\,s^{-2}}\), and the standard deviation is 
7.57 \(\mathrm{km\,s^{-1}}\). The standard deviation is usually the statistic employed to measure 
population spread. The mean of Group A is 12.85 \(\mathrm{km\,s^{-1}}\) and examination of 
Figure 2.2 or the table shows that a deviation of \(\sigma = 7.57\ \mathrm{km\,s^{-1}}\) is “typical” for 
a member of this population. 
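
The quoted statistics for Group A can be reproduced from the 25 speeds in Table 2.4. The following sketch is illustrative, with assumed variable names. It computes the population variance with M weighting in two ways: as the mean squared deviation, and in the algebraically equivalent form of the mean of the squares minus the square of the mean.

```python
import math

# Speeds in km/s for the 25 iron-rich stars of Group A (Table 2.4).
group_a = [0.5, 7.1, 9.2, 14.6, 18.8,
           1.1, 7.5, 10.7, 15.2, 19.6,
           5.5, 7.8, 12.0, 16.1, 24.2,
           5.6, 7.9, 14.3, 17.1, 26.6,
           6.9, 8.1, 14.5, 18.0, 32.3]

M = len(group_a)
mu = sum(group_a) / M                                  # population mean

# Population variance: the mean of the squared deviations from mu.
variance = sum((x - mu) ** 2 for x in group_a) / M

# Equivalent form: mean of the squares minus the square of the mean.
variance_alt = sum(x * x for x in group_a) / M - mu ** 2

sigma = math.sqrt(variance)                            # population standard deviation

print(round(mu, 2), round(variance, 2), round(sigma, 2))
```

The results agree with the values quoted in the text: a mean near 12.85 km/s, a variance near 57.25 km^2/s^2, and a standard deviation near 7.57 km/s.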



2.2.2 Estimating population statistics 

Many populations are so large that it is impractical to tabulate all members. This 
is the situation with most astronomical measurements. In this case, the strategy 
is to estimate the descriptive statistics for the population from a small sample. 
We choose the sample so that it represents the larger population. For example, a 
sample of five or ten trials at measuring the brightness of the star Malificus 
represents the population that contains all possible equivalent measurements of 
its brightness. Most scientific measurements are treated as samples of a 
much larger population of possible measurements. 

A similar situation arises if we have a physical population that is finite but 
very large (the masses of every star in the Milky Way, for example). We can 
discover the characteristics of this population by taking a sample that has only N 
members (masses for 50 stars picked at random in the Milky Way). 

In any sampling operation, we estimate the population mean from the sample 
mean, \(\bar{x}\). All other things being equal, we believe a larger sample will give a 
better estimate. In this sense, the population mean is the limiting value of the 
sample mean, and the sample mean is the best estimator of the population mean. 
If the sample has N members: 

\[ \mu = \lim_{N \to \infty} \frac{1}{N}\sum_{i=1}^{N} x_i = \lim_{N \to \infty} \bar{x}, \qquad \mu \approx \bar{x} \] 

To estimate the population variance from a sample, the best statistic is \(s^2\), the 
sample variance computed with \((N - 1)\) weighting: 

\[ \sigma^2 \approx s^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2 \qquad (2.4) \] 

The \((N-1)^{-1}\) factor in Equation (2.4) (instead of just \(N^{-1}\)) arises because \(\bar{x}\) is 
an estimate of the population mean, and is not \(\mu\) itself. The difference is perhaps 
clearest in the case where N = 2. For such small N, it is likely that \(\bar{x} \neq \mu\). Then 
the definition of \(\bar{x}\) guarantees that 

\[ \sum_{i=1}^{N}(x_i - \mu)^2 > \sum_{i=1}^{N}(x_i - \bar{x})^2 \] 

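which suggests that $N^{-1}$ weighting will consistently underestimate the
population variance. In the limit of large N, the two expressions are equivalent:

$$\sigma^2 = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 = \lim_{N \to \infty} \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 = \lim_{N \to \infty} s^2$$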



Proof that Equation (2.4) is the best estimate of $\sigma^2$ can be found in elementary
references on statistics. The square root of $s^2$ is called the standard deviation of
the sample. Since most astronomical measurements are samples of a population,
the dispersion of the population is usually estimated as

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (2.5)$$

which is the expression introduced at the beginning of the chapter (Equation (2.1)).

The terminology for s and $\sigma$ can be confusing. It is unfortunately common to
shorten the name for s to just “the standard deviation,” and to represent it with
the symbol $\sigma$. You, the reader, must then discern from the context whether the
statistic is an estimate (i.e. s, computed from a sample by Equation (2.5)) or a
complete description (i.e. $\sigma$, computed from the population by Equation (2.3)).
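
The claim that $N^{-1}$ weighting underestimates the population variance can be checked numerically. The sketch below is only an illustration: the Gaussian population with $\sigma^2 = 4$, the sample count, and the function name are assumptions made for the example, not values from the text. It averages both estimators over many samples of size N = 2.

```python
import random
import statistics

def average_variance_estimates(n_members=2, n_samples=5000, sigma=2.0, seed=1):
    """Average the N-weighted and (N-1)-weighted variance over many small samples."""
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(n_samples):
        # Hypothetical population: Gaussian with mean 0 and variance sigma**2.
        sample = [rng.gauss(0.0, sigma) for _ in range(n_members)]
        biased_total += statistics.pvariance(sample)   # divides by N
        unbiased_total += statistics.variance(sample)  # divides by N - 1, as in Equation (2.4)
    return biased_total / n_samples, unbiased_total / n_samples

biased, unbiased = average_variance_estimates()
print(f"N weighting:     {biased:.2f}")
print(f"(N-1) weighting: {unbiased:.2f}  (population variance is 4.00)")
```

For N = 2 the N-weighted average should come out near half the population variance, while the (N − 1)-weighted average should scatter around the true value.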







2.3 Probability distributions 

The most important questions of life are, for the most part, really only problems 
of probability. 

— Pierre-Simon Laplace, A Philosophical Essay on Probabilities 



2.3.1 The random variable 



Since scientific measurements generally only sample a population, we consider
the construction of a sample a little more carefully. Assume we have a large
population, Q. For example, suppose we have a jar full of small metal spheres of
differing diameters, and wish to sample those diameters in a representative
fashion. Imagine doing this by stirring up the contents, reaching into the jar
without looking, and measuring the diameter of the sphere selected. This operation
is a trial, and its result is a diameter, x. We call x a random variable: its
value depends not at all on the selection method (we hope). Although the value
of x is unpredictable, there clearly is a function that describes how likely it is to
obtain a particular value for x in a single trial. This function, $P_Q$, is called the
probability distribution of x in Q. In the case where x can take on any value over
a continuous range, we define:



$P_Q(x)\,dx$ = the probability that the result of a single trial will
have a value between x and x + dx

Sometimes a random variable is restricted to a discrete set of possible values. In
our example, this would be the case if all the spheres were manufactured by a
ball-bearing factory that only made spheres with diameters that were integral
multiples of 1 mm. In this case, the definition of the probability distribution function
has to be a little different:

$P_Q(x_j)$ = the probability that the result of a single trial will have
a value $x_j$, where j = 1, 2, 3, ...

For our example, $P_Q$ might look like Figure 2.3, where (a) shows a continuous
distribution in which any diameter over a continuous range is possible. Plot (b)
shows a discrete distribution with only six possible sizes.

In experimental situations, we sometimes know or suspect something about 
the probability distribution before conducting any quantitative trials. We might, 
for example, look into our jar of spheres and get the impression that “there seem 
to be only two general sizes, large and small.” Knowing something about the 
expected distribution before making a set of trials can be helpful in designing 
the experiment and in analyzing the data. Nature, in fact, favors a small number 
of distributions. Two particular probability distributions arise so often in astron- 
omy that they will repay special attention. 





Fig. 2.3 Probability distributions of the diameters of spheres, in millimeters.
(a) A continuous distribution; (b) a discrete distribution, in which only six
sizes are present.






2.3.2 The Poisson distribution 

The Poisson¹ distribution describes a population encountered in certain counting
experiments. These are cases in which the random variable, x, is the number
of events counted in a unit time: the number of raindrops hitting a tin roof in 1
second, the number of photons hitting a light meter in 10 seconds, or the number
of nuclear decays in an hour. For counting experiments where non-correlated
events occur at an average rate, $\mu$, the probability of counting x events in a single
trial is

$$P_P(x, \mu) = \frac{\mu^x}{x!}\, e^{-\mu}$$

Here, $P_P(x, \mu)$ is the Poisson distribution. For example, if you are listening to
raindrops on the roof in a steady rain, and on average hear 3.25 per second, then
$P_P(0, 3.25)$ is the probability that you will hear zero drops in the next one-second
interval. Of course, $P_P(x, \mu)$ is a discrete distribution, with x restricted
to non-negative integer values (you can never hear 0.266 drops, nor could you
hear −1 drops). Figure 2.4 illustrates the Poisson distribution for three different
values of $\mu$. Notice that as $\mu$ increases, so does the dispersion of the distribution.
An important property of the Poisson distribution, in fact, is that its variance is
exactly equal to its mean:

$$\sigma^2 = \mu$$



This behavior has very important consequences for planning and analyzing
experiments. For example, suppose you count the number of photons, N, that
arrive at your detector in t seconds. If you count N things in a single trial, you
can estimate that the average result of a single trial of length t seconds will be a
count of $\mu \approx x = N$ photons. How uncertain is this result? The uncertainty in a
result can be judged by the standard deviation of the population from which the
measurement is drawn. So, assuming Poisson statistics apply, the uncertainty of
the measurement should be $\sigma = \sqrt{\mu} \approx \sqrt{N}$. The uncertainty in the rate in units
of counts per second would be $\sigma/t \approx \sqrt{N}/t$. The fractional uncertainty is:

$$\text{Fractional uncertainty in counting } N \text{ events} = \frac{\sigma}{\mu} \approx \frac{\sqrt{N}}{N} = \frac{1}{\sqrt{N}}$$



1 Simeon-Denis Poisson (1781—1840) in youth resisted his family's attempts to educate him in 
medicine and the law. After several failures in finding an occupation that suited him, he became 
aware of his uncanny aptitude for solving puzzles, and embarked on a very prolific career in 
mathematics, becoming Laplace’s favorite pupil. Poisson worked at a prodigious rate, both in 
mathematics and in public service in France. Given his rather undirected youth, it is ironic that he 
characterized his later life with his favorite phrase: “La vie, c’est le travail.” 








Fig. 2.4 The Poisson distribution for values of $\mu = 1.4$ (filled circles),
$\mu = 2.8$ (open circles), and $\mu = 8.0$ (open triangles). Note that only the
plotted symbols have meaning as probabilities. The curves merely assist the
eye in distinguishing the three distributions.



Thus, to decrease the uncertainty in your estimate of the photon arrival rate $\mu/t$,
you should increase the number of photons you count (by increasing either the
exposure time or the size of your telescope). To cut uncertainty in half, for
example, increase the exposure time by a factor of 4.
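
As a rough numerical companion to this section, the sketch below evaluates the Poisson probability for the raindrop example and the $1/\sqrt{N}$ fractional uncertainty. The function names and the photon counts of 10,000 and 40,000 are assumptions chosen for illustration only.

```python
import math

def poisson_probability(x, mu):
    """P_P(x, mu): probability of counting exactly x events when the mean count is mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)

def fractional_uncertainty(n_counted):
    """Fractional uncertainty sigma/mu, approximately 1/sqrt(N), for a Poisson count of N events."""
    return 1.0 / math.sqrt(n_counted)

# Probability of hearing zero raindrops in one second at a mean of 3.25 per second.
p_zero = poisson_probability(0, 3.25)
print(f"P_P(0, 3.25) = {p_zero:.4f}")

# Hypothetical counts: quadrupling the number of photons halves the fractional uncertainty.
for n_photons in (10_000, 40_000):
    print(n_photons, f"{fractional_uncertainty(n_photons):.4f}")
```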



2.3.3 The Gaussian, or normal, distribution 

The Gaussian², or normal, distribution is the most important continuous distribution
in the statistical analysis of data. Empirically, it seems to describe the
distribution of trials for a very large number of different experiments. Even in
situations where the population itself is not described by a Gaussian (e.g. Figure
2.3), estimates of the summary statistics of the population (e.g. the mean) are
described by a Gaussian.

If a population has a Gaussian distribution, then in a single trial the probability
that x will have a value between x and x + dx is

$$P_G(x, \mu, \sigma)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] dx \qquad (2.6)$$



Figure 2.5 illustrates this distribution, a shape sometimes called a bell curve. In
Equation (2.6), $\mu$ and $\sigma$ are the mean and standard deviation of the distribution,
and they are independent of one another (unlike the Poisson distribution). We
will find it useful to describe the dispersion of a Gaussian by specifying its


2 Karl Friedrich Gauss (1777–1855) was a child prodigy who grew to dominate mathematics during
his lifetime. He made several important contributions to geometry and number theory in his early
20s, after rediscovering many theorems because he did not have access to a good mathematics
library. In January 1801 the astronomer Piazzi discovered Ceres, the first minor planet, but the
object was soon lost. Gauss immediately used a new method and only three recorded observations
to compute the orbit of the lost object. His predicted positions led to the recovery of Ceres, fame,
and eventually a permanent position at Göttingen Observatory. At Göttingen, Gauss made important
contributions to differential geometry and to many areas of physics, and was involved in the
invention of the telegraph.






Fig. 2.5 (a) A Gaussian distribution with a mean of 5 and a standard
deviation of 2.1. The curve peaks at P = 1.2, and the FWHM is drawn at half this
level. In contrast to the Poisson distribution, note that P is defined for
negative values of x. (b) The standard normal distribution.



full width at half-maximum (FWHM), that is, the separation in x between the
two points where

$$P_G(x, \mu, \sigma) = \frac{1}{2} P_G(\mu, \mu, \sigma)$$

The FWHM is proportional to $\sigma$:

$$\mathrm{FWHM}_{\mathrm{Gaussian}} = 2.354\,\sigma$$

The dispersion of a distribution determines how likely it is that a single
sample will turn out to be close to the population mean. One measure of dispersion,
then, is the probable error, or P.E. By definition, a single trial has a
50% probability of lying closer to the mean than the P.E., that is

$$\text{Probability that } (|x - \mu| < \mathrm{P.E.}) = 1/2$$

The P.E. for a Gaussian distribution is directly proportional to its standard
deviation:

$$(\mathrm{P.E.})_{\mathrm{Gaussian}} = 0.6745\,\sigma = 0.2865\,(\mathrm{FWHM})$$
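
The numerical factors quoted for the FWHM and the probable error can be recovered from the Gaussian itself. The following sketch is an illustrative check using only Python's standard library; the variable names are assumptions, and the results should agree with the quoted constants to about three significant figures.

```python
import math
from statistics import NormalDist

# FWHM of a Gaussian: solve exp(-x^2 / (2 sigma^2)) = 1/2 and double the half-width.
fwhm_over_sigma = 2.0 * math.sqrt(2.0 * math.log(2.0))

# Probable error: half of all trials fall within +/- P.E. of the mean, so
# P.E./sigma is the 75th-percentile point of the standard normal distribution.
pe_over_sigma = NormalDist(mu=0.0, sigma=1.0).inv_cdf(0.75)

print(f"FWHM / sigma = {fwhm_over_sigma:.4f}")
print(f"P.E. / sigma = {pe_over_sigma:.4f}")
print(f"P.E. / FWHM  = {pe_over_sigma / fwhm_over_sigma:.4f}")
```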

2.3.4 The standard normal distribution 

$P_G(x, \mu, \sigma)$ is difficult to tabulate since its value depends not only on the
variable, x, but also on the two additional parameters, $\mu$ and $\sigma$. This prompts us to
define a new random variable:

$$z = \frac{x - \mu}{\sigma}, \qquad dz = \sigma^{-1}\,dx \qquad (2.7)$$



After substitution in (2.6), this gives:

$$P_{SN}(z) = P_G(z, 0, 1) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) \qquad (2.8)$$



Equation (2.8), the Gaussian distribution with zero mean and unit variance, is
tabulated in Appendix C1. You can extract values for a distribution with a
specific $\mu$ and $\sigma$ from the table through Equations (2.7).
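
The same table lookup can be sketched in code. In this illustration, Python's `statistics.NormalDist` stands in for the printed table of the standard normal distribution, and the helper `probability_between` is a hypothetical name introduced here. The example uses the mean and standard deviation quoted for Figure 2.5(a).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)  # plays the role of the printed table

def probability_between(x_low, x_high, mu, sigma):
    """Probability that a Gaussian variable lies in [x_low, x_high], via z = (x - mu) / sigma."""
    z_low = (x_low - mu) / sigma
    z_high = (x_high - mu) / sigma
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)

# Distribution of Figure 2.5(a): mean 5, standard deviation 2.1.
p_one_sigma = probability_between(5.0 - 2.1, 5.0 + 2.1, mu=5.0, sigma=2.1)
print(f"P(|x - mu| < sigma) = {p_one_sigma:.4f}")
```

Only the transformed variable z is ever passed to the tabulated function, which is the point of Equations (2.7).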



2.3.5 Other distributions 

Many other distributions describe populations in nature. We will not discuss 
these here, but only remind you of their existence. You have probably encoun- 
tered some of these in everyday life. A uniform distribution, for example, 
describes a set of equally likely outcomes, as when the value of the random 
variable, x, is the outcome of the roll of a single die. Other distributions are 
important in elementary physics. The Maxwell distribution, for example,
describes the probability that in a gas of temperature T a randomly selected
particle will have energy E.



2.3.6 Mean and variance of a distribution 



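Once the distribution is known, it is a simple matter to compute the population
mean. Suppose $P(x, \mu, \sigma)$ is a continuous distribution, and $P'(x_i, \mu', \sigma')$ is a
discrete distribution. The mean of each population is:

$$\mu = \frac{\int_{-\infty}^{\infty} x P\,dx}{\int_{-\infty}^{\infty} P\,dx} = \frac{1}{N}\int_{-\infty}^{\infty} x P\,dx \qquad (2.9)$$

and

$$\mu' = \frac{\sum_{i=-\infty}^{\infty} x_i P'(x_i, \mu', \sigma')}{\sum_{i=-\infty}^{\infty} P'(x_i, \mu', \sigma')} = \frac{1}{N'}\sum_{i=-\infty}^{\infty} x_i P'(x_i, \mu', \sigma') \qquad (2.10)$$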



If $P'$ and $P$ are, respectively, the probability and the probability density, then
the terms in the denominators of the above equations (the “normalizations,” $N$
and $N'$) should equal one.

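Population variance (and standard deviation) also is easy to compute from
the distribution:

$$\sigma^2 = \frac{1}{N}\int_{-\infty}^{\infty} (x-\mu)^2 P\,dx = \frac{1}{N}\int_{-\infty}^{\infty} x^2 P\,dx - \mu^2 \qquad (2.11)$$

and

$$\sigma'^2 = \frac{1}{N'}\sum_{i=-\infty}^{\infty} (x_i-\mu')^2 P'(x_i, \mu', \sigma') = \frac{1}{N'}\sum_{i=-\infty}^{\infty} x_i^2 P'(x_i, \mu', \sigma') - \mu'^2 \qquad (2.12)$$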
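
As an illustrative check of the discrete formulas, the sketch below applies Equations (2.10) and (2.12) to a Poisson distribution with mean 3.25 (the raindrop rate used earlier) and confirms that the variance equals the mean. The function names and the truncation of the infinite sum at x = 59 are assumptions of this example.

```python
import math

def poisson(x, mu):
    """Poisson probability of counting x events when the mean count is mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)

def discrete_mean_and_variance(values, probabilities):
    """Apply Equations (2.10) and (2.12) to a tabulated discrete distribution."""
    norm = sum(probabilities)  # the normalization N'
    mean = sum(x * p for x, p in zip(values, probabilities)) / norm
    variance = sum(x * x * p for x, p in zip(values, probabilities)) / norm - mean**2
    return mean, variance

# Truncate the infinite sum at x = 59; the neglected tail is negligible for mu = 3.25.
values = list(range(60))
probabilities = [poisson(x, 3.25) for x in values]
mean, variance = discrete_mean_and_variance(values, probabilities)
print(f"mean = {mean:.4f}, variance = {variance:.4f}")
```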



2.4 Estimating uncertainty 

We can now address the central issue of this chapter: How do you estimate the
uncertainty of a particular quantitative measurement? You now recognize that most
measurements result from sampling the very large population of all possible
measurements. We consider a very common situation: a scientist samples a
population by making n measurements, and computes the mean of the sample.
He knows that this sample mean is the best guess for the population mean. The
question is: How good is this guess? How close to the population mean is the
sample mean? What uncertainty should he attach to it?

2.4.1 The Central Limit Theorem 

Return to the example of a very large population of metal spheres that have a 
distribution of diameters as illustrated by Figure 2.3a. This distribution is clearly 






Fig. 2.6 (a) The distribution of a sample of 100 trials of the random variable $\bar{x}_5$.
The solid curve is the distribution of the individual x values. Distribution (b) is
for a sample of 800 trials of the random variable $\bar{x}_5$. This distribution is
approximately Gaussian with a standard deviation of 1.13. Distribution (c) is
the same as (a), except the random variable is $\bar{x}_{20}$. Its standard deviation is 0.54.



not Gaussian. Nevertheless, properties of the Gaussian are relevant even for this
distribution. Consider the problem of estimating the average size of a sphere.
Suppose we ask Dora, our cheerful assistant, to conduct a prototypical experiment:
select five spheres at random, measure them and compute the average
diameter. The result of such an experiment is a new random variable, $\bar{x}_5$, which
is an estimate of the mean of the entire non-Gaussian population of spheres.
Dora is a tireless worker. She does not stop with just five measurements, but
enthusiastically conducts many experiments, pulling out many spheres at random,
five at a time, and tabulating many different values for $\bar{x}_5$. When we finally
get her to stop measuring, Dora becomes curious about the distribution of her
tabulated values. She plots the histograms shown in Figures 2.6a and 2.6b, the
results for 100 and 800 determinations of $\bar{x}_5$ respectively.

“Looks like a Gaussian,” says Dora. “In fact, the more experiments I do, the
more the distribution of $\bar{x}_5$ looks like a Gaussian. This is curious, because the
original distribution of diameters (the solid curve in Figure 2.6a) was not
Gaussian.”

Dora is correct. Suppose that P(x) is the probability distribution for
random variable x, where P(x) is characterized by mean $\mu$ and variance $\sigma^2$,
but otherwise can have any form whatsoever. In our example, P(x) is the bimodal
function plotted in Figure 2.3(a). The Central Limit Theorem states that if
$\{x_1, x_2, \ldots, x_n\}$ is a sequence of n independent random variables drawn from P,
then as n becomes large, the distribution of the variables

$$\bar{x}_n = \frac{1}{n} \sum_{i=1}^{n} x_i$$

will approach a Gaussian distribution with mean $\mu$ and variance $\sigma^2/n$.

To illustrate this last statement, Dora computes the values of a new random
variable $\bar{x}_{20}$, which is the mean of 20 individual xs. The distribution of 100 $\bar{x}_{20}$s
is shown in Figure 2.6c. As expected, the new distribution has about one-half the
dispersion of the one for the $\bar{x}_5$s.
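
Dora's experiment is easy to imitate with a short simulation. The sketch below is illustrative only: the bimodal population (two Gaussian clumps near 3 mm and 8 mm), the number of repetitions, and the function names are assumptions, not the distribution actually plotted in Figure 2.3(a). It compares the scatter of means of 5 with the scatter of means of 20.

```python
import random
import statistics

def draw_diameter(rng):
    """One trial from a hypothetical bimodal population of sphere diameters (mm)."""
    if rng.random() < 0.5:
        return rng.gauss(3.0, 0.7)   # assumed "small" spheres
    return rng.gauss(8.0, 0.7)       # assumed "large" spheres

def sample_means(n_per_mean, n_means, rng):
    """Repeat Dora's experiment: average n_per_mean diameters, n_means times."""
    return [
        statistics.fmean(draw_diameter(rng) for _ in range(n_per_mean))
        for _ in range(n_means)
    ]

rng = random.Random(42)
spread_5 = statistics.stdev(sample_means(5, 4000, rng))
spread_20 = statistics.stdev(sample_means(20, 4000, rng))
print(f"std of means of 5:  {spread_5:.3f}")
print(f"std of means of 20: {spread_20:.3f}")
print(f"ratio: {spread_20 / spread_5:.2f}  (the theorem predicts sqrt(5/20) = 0.50)")
```

Because the variance of the mean scales as $\sigma^2/n$, quadrupling the number of spheres per average should roughly halve the scatter, whatever the shape of the parent population.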

Since so many measurements in science are averages of individual experiments,
the Central Limit Theorem means that the Gaussian distribution will be
central to the analysis of experimental results. In addition, the conclusion that the
variance of the average is proportional to 1/n relates directly to the problem of
estimating uncertainty. Since s, the estimated standard deviation, is the best guess
for $\sigma$, we should estimate $\sigma_\mu(n)$, the standard deviation of $\bar{x}_n$, the mean, as

$$\sigma_\mu(n) = \frac{s}{\sqrt{n}} \qquad (2.13)$$

Here, s is computed according to Equation (2.5) from the scatter in the n
individual measurements. It is common to simply quote the value of $\sigma_\mu(n)$ as the
uncertainty in a measurement. The interpretation of this number is clear because
the Central Limit Theorem implies that $\sigma_\mu(n)$ is the standard deviation of an





approximately Gaussian distribution: one then knows, for example, that there
is a 50% probability that $\bar{x}_n$ is within $0.6745\,\sigma_\mu(n)$ of the “true” value, $\mu$.
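
A minimal sketch of this recipe follows. The eight measurement values are hypothetical numbers invented for illustration; the calculation uses the (N − 1)-weighted standard deviation, divides by $\sqrt{n}$, and converts the result to a probable error.

```python
import math
import statistics

# Hypothetical repeated measurements of one quantity (arbitrary units).
measurements = [10.2, 9.8, 10.5, 9.9, 10.1, 10.4, 9.7, 10.0]

n = len(measurements)
mean = statistics.fmean(measurements)
s = statistics.stdev(measurements)        # sample standard deviation, (N - 1) weighting
sigma_mean = s / math.sqrt(n)             # standard deviation of the mean
probable_error = 0.6745 * sigma_mean      # 50% interval for a Gaussian

print(f"mean = {mean:.3f} +/- {sigma_mean:.3f} (P.E. {probable_error:.3f})")
```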

2.4.2 Reducing uncertainty 

The Central Limit Theorem, which applies to all distributions, as well as the
elementary properties of the Poisson distribution, which applies to counting
experiments, both suggest that the way to reduce the uncertainty (and increase
both precision and possibly accuracy) in any estimate of the population mean is
repetition. Either increase the number of trials, or increase the number of things
counted. (In astronomy, where a trial often involves counting photons, the two
sometimes amount to the same thing.) If N is either the number of repetitions, or
the number of things counted, then the basic rule is:

$$\text{relative uncertainty} \propto \frac{1}{\sqrt{N}} \qquad (2.14)$$

This implies that success, or at least experimental accuracy, in astronomy 
involves making N large. This means having a large telescope (i.e. able to 
collect many photons) for a long time (i.e. able to conduct many measurements). 
You should keep a number of very important cautions in mind while pondering 
the importance of Equation (2.14). 

• The cost of improved accuracy is high. To decrease uncertainty by a factor of 100, for 
example, you have to increase the number of experiments (the amount of telescope 
time, or the area of its light-gathering element) by a factor of 10,000. At some point the 
cost becomes too high. 

• Equation (2.14) only works for experiments or observations that are completely inde- 
pendent of one another and sample a stationary population. In real life, this need not be 
the case: for example, one measurement can have an influence on another by sensitiz- 
ing or desensitizing a detector, or the brightness of an object can change with time. In 
such cases, the validity of (2.14) is limited. 

• Equation (2.13) only describes uncertainties introduced by scatter in the parent 
population. You should always treat this as the very minimum possible uncertainty. 
Systematic errors will make an additional contribution, and often the dominant one. 



2.5 Propagation of uncertainty 

2.5.1 Combining several variables 

We consider first the special case where the quantity of interest is the sum or 
difference of more than one measured quantity. For example, in differential 
photometry, you are often interested in the magnitude difference 



$$\Delta m = m_1 - m_2$$






Here $m_1$ is the measured instrumental magnitude of a standard or comparison
object, and $m_2$ is the instrumental magnitude of an unknown object. Clearly,
the uncertainty in $\Delta m$ depends on the uncertainties in both $m_1$ and $m_2$. If these
uncertainties are known to be $\sigma_1$ and $\sigma_2$ then the uncertainty in $\Delta m$ is given by

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 \qquad (2.15)$$

The above formula could be stated: “the variance of a sum (or difference) is 
the sum of the variances.” Equation (2.15) could also be restated by saying that 
the uncertainties “add in quadrature.” 

Equation (2.15) suggests that a magnitude difference, $\Delta m$, will be more
uncertain than the individual magnitudes. This is true only with respect to
random errors, however. For example, if a detector is badly calibrated and 
always reads out energy values that are too high by 20%, then the values of 
individual fluxes and magnitudes it yields will be very inaccurate. However, 
systematic errors of this kind will often cancel each other if a flux ratio or 
magnitude difference is computed. It is a very common strategy to use such 
differential measurements as a way of reducing systematic errors. 

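A second special case of combining uncertainties concerns products or ratios
of measured quantities. If, for example, one were interested in the ratio between
two fluxes, $F_1 \pm \sigma_1$ and $F_2 \pm \sigma_2$,

$$R = \frac{F_1}{F_2}$$

then the uncertainty in R is given by:

$$\left(\frac{\sigma_R}{R}\right)^2 = \left(\frac{\sigma_1}{F_1}\right)^2 + \left(\frac{\sigma_2}{F_2}\right)^2 \qquad (2.16)$$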



One might restate this equation by saying that for a product (or ratio) the relative 
uncertainties of the factors add in quadrature. 



2.5.2 General rule 

In general, if a quantity, G, is a function of n variables, $G = G(x_1, x_2, x_3, \ldots, x_n)$,
and each variable has uncertainty (or standard deviation) $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n$, then
the variance in G is given by:

$$\sigma_G^2 = \sum_{i=1}^{n} \left(\frac{\partial G}{\partial x_i}\right)^2 \sigma_i^2 + \mathrm{covar} \qquad (2.17)$$

Here the term “covar” measures the effect of correlated deviations. We assume 
this term is zero. You should be able to verify that Equations (2.15) and (2.16) 
follow from this expression. 
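
One way to check that the two special cases follow from the general rule is numerical. The sketch below is an illustration under stated assumptions: `propagate` is a hypothetical helper that approximates the partial derivatives by central finite differences and sets the covariance term to zero, and the two flux values and their uncertainties are invented for the example.

```python
import math

def propagate(func, values, sigmas, rel_step=1e-6):
    """Uncertainty in func(*values) from the general rule, with zero covariance.

    Partial derivatives are approximated by central finite differences.
    """
    variance = 0.0
    for i, (x, sigma) in enumerate(zip(values, sigmas)):
        step = rel_step * max(abs(x), 1.0)
        up = list(values)
        down = list(values)
        up[i] = x + step
        down[i] = x - step
        derivative = (func(*up) - func(*down)) / (2.0 * step)
        variance += (derivative * sigma) ** 2
    return math.sqrt(variance)

# Hypothetical fluxes F1 = 1200 +/- 35 and F2 = 800 +/- 28 (arbitrary units).
f1, s1, f2, s2 = 1200.0, 35.0, 800.0, 28.0

sigma_diff = propagate(lambda a, b: a - b, [f1, f2], [s1, s2])
sigma_ratio = propagate(lambda a, b: a / b, [f1, f2], [s1, s2])

ratio = f1 / f2
print(f"difference: {sigma_diff:.2f}  vs quadrature sum {math.hypot(s1, s2):.2f}")
print(f"ratio: {sigma_ratio:.4f}  vs relative quadrature {ratio * math.hypot(s1 / f1, s2 / f2):.4f}")
```

For a difference the partial derivatives are +1 and −1, so the variances add; for a ratio they are $1/F_2$ and $-F_1/F_2^2$, which gives the relative-uncertainty form after dividing by $R^2$.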







2.5.3 Several measurements of a single variable 

Suppose three different methods for determining the distance to the center of our
Galaxy yield values 8.0 ± 0.3, 7.8 ± 0.7 and 8.25 ± 0.20 kiloparsecs. What is
the best combined estimate of the distance, and what is its uncertainty? The
general rule in this case is that if measurements $y_1, y_2, y_3, \ldots, y_n$ have associated
uncertainties $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n$, then the best estimate of the central value is

$$y_c = \sigma_c^2 \sum_{i=1}^{n} \frac{y_i}{\sigma_i^2} \qquad (2.18)$$

where the uncertainty of the combined quantity is

$$\sigma_c^2 = \frac{1}{\sum_{i=1}^{n} (1/\sigma_i^2)}$$

If we define the weight of a measurement to be

$$w_i = \frac{1}{\sigma_i^2}$$

Equation (2.18) becomes

$$y_c = \frac{\sum_{i=1}^{n} y_i w_i}{\sum_{i=1}^{n} w_i} \qquad (2.19)$$

Computed in this way, $y_c$ is called the weighted mean. Although the weights,
$\{w_1, w_2, \ldots, w_n\}$, should be proportional to the reciprocal of the variance of
each of the y values, they can often be assigned by rather subjective
considerations like the “reliability” of different astronomers, instruments, or
methods. However assigned, each $w_i$ represents the relative probability that a
particular observation will yield the “correct” value for the quantity in question.

To complete our example, the best estimate for the distance to the center of
the Galaxy from the above data would assign weights 11.1, 2.0 and 25 to the
three measurements, resulting in a weighted mean of

$$y_c = \frac{1}{11.1 + 2 + 25}\left[11.1(8) + 2(7.8) + 25(8.25)\right] = 8.15 \text{ kpc}$$

and an uncertainty of

$$(11.1 + 2 + 25)^{-1/2} = 0.16 \text{ kpc}$$

Notice that the uncertainty of the combined result is less than the uncertainty of 
even the best of the individual results, and that of the three measurements, the 
one with the very large uncertainty (7.8 ± 0.7 kpc) has little influence on the 
weighted mean. 
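
The worked example can be reproduced in a few lines. This is a minimal sketch in which `weighted_mean` is a hypothetical helper name; it takes the weights to be the reciprocals of the variances, as in the text.

```python
def weighted_mean(values, sigmas):
    """Weighted mean and its uncertainty, with weights w_i = 1 / sigma_i**2."""
    weights = [1.0 / sigma**2 for sigma in sigmas]
    total_weight = sum(weights)
    mean = sum(w * y for w, y in zip(weights, values)) / total_weight
    return mean, total_weight ** -0.5

distances_kpc = [8.0, 7.8, 8.25]
sigmas_kpc = [0.3, 0.7, 0.20]
y_c, sigma_c = weighted_mean(distances_kpc, sigmas_kpc)
print(f"{y_c:.2f} +/- {sigma_c:.2f} kpc")  # 8.15 +/- 0.16 kpc
```

Dropping the 7.8 ± 0.7 kpc entry from both lists changes the result only slightly, which is one way to see how little weight that measurement carries.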






2.6 Additional topics 

Several topics in elementary statistics are important in the analysis of data but
are beyond the scope of this introduction. The chi-square ($\chi^2$) statistic measures
the deviation between experimental measurements and their theoretically
expected values (e.g. from an assumed population distribution). Tests based
on this statistic can assign a probability to the truth of the theoretical assumptions.
Least-squares fitting methods minimize the $\chi^2$ statistic in the case of an
assumed functional fit to experimental data (e.g. brightness as a function of time,
color as a function of brightness, . . .).

You can find elementary discussions of these and many more topics in either 
Bevington (1969) or Lyons (1991). The books by Jaschek and Murtagh (1990) 
and by Wall and Jenkins (2003) give more advanced treatments of topics par- 
ticularly relevant to astronomy. 



Summary 



Precision, but not accuracy, can be estimated from the scatter in measurements.

Standard deviation is the square root of the variance. For a population,

$$\sigma^2 = \frac{1}{M}\sum_{i=1}^{M}(x_i - \mu)^2 = \frac{1}{M}\sum_{i=1}^{M} x_i^2 - \mu^2$$



From a sample, the best estimate of the population variance is

$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2$$



Probability distributions describe the expected values of a random variable
drawn from a parent population. The Poisson distribution describes
measurements made by counting uncorrelated events like the arrival of
photons. For measurements following the Poisson distribution,

$$\sigma^2_{\mathrm{Poisson}} = \mu_{\mathrm{Poisson}}$$

The Gaussian distribution describes many populations whose values
have a smooth and symmetric distribution. The Central Limit Theorem
contends that the mean of n random samples drawn from a population of
mean $\mu$ and variance $\sigma^2$ will take on a Gaussian distribution in the limit of
large n. This is a distribution whose mean approaches $\mu$ and whose variance
approaches $\sigma^2/n$. For any distribution, the uncertainty (standard deviation)
of the mean of n measurements of the variable x is

$$\sigma_\mu(n) = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}$$






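The variance of a function of several uncorrelated variables, each with its
own variance, is given by

$$\sigma_G^2 = \sum_{i=1}^{n} \left(\frac{\partial G}{\partial x_i}\right)^2 \sigma_i^2 + \mathrm{covar}$$

For measurements of unequal variance, the weighted mean is

$$y_c = \sigma_c^2 \sum_{i=1}^{n} \frac{y_i}{\sigma_i^2}$$

and the combined variance is

$$\sigma_c^2 = \frac{1}{\sum_{i=1}^{n} (1/\sigma_i^2)}$$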


Exercises 

1. There are some situations in which it is impossible to compute the mean value for a
set of data. Consider this example. The ten crew members of the starship Nostromo 
are all exposed to an alien virus at the same time. The virus causes the deaths of nine 
of the crew at the following times, in days, after exposure: 

1.2, 1.8, 2.1, 2.4, 2.6, 2.9, 3.3, 4.0, 5.4 

The tenth crew member is still alive after 9 days, but is infected with the virus. Based 
only on this data: 

(a) Why can’t you compute the “average survival time” for victims of the Nostromo 
virus? 

(b) What is the “expected survival time”? (A victim has a 50-50 chance of surviv- 
ing this long.) Justify your computation of this number. 

2. An experimenter makes eleven measurements of a physical quantity, X, that can only 
take on integer values. The measurements are 

0,1,2,3,4,5,6,7,8,9,10 

(a) Estimate the mean, median, variance (treating the set as a sample of a popula- 
tion) and standard deviation of this set of measurements. 

(b) The same experimenter makes a new set of 25 measurements of X, and finds
that the values

0,1,2,3,4,5,6,7,8,9,10

occur

0, 1, 2, 3, 4, 5, 4, 3, 2, 1, and 0

times respectively. Again, estimate the mean, median, variance and standard 
deviation of this set of measurements. 






3. Describe your best guess as to the parent distributions of the samples given in 
questions 2(a) and 2(b). 

4. Assume the parent distribution for the measurements in problem 2(b) is actually a
Poisson distribution. (a) Explain how you would estimate the Poisson parameter, $\mu$.
From the resulting function, $P_P(X, \mu)$, compute (b) the probability of obtaining a
value of exactly zero in a single measurement, and (c) the probability of obtaining a
value of exactly 11.

5. Now assume the parent distribution for the measurements in problem 2(b) is actually
a Gaussian distribution. From your estimate of $P_G(x, \mu, \sigma)$, again compute (a) the
probability of obtaining a value of exactly zero in a single measurement, as well as
(b) the probability of obtaining a value of exactly 11. You will have to decide what
“exactly 0” etc. means in this context.

6. Which distribution, Poisson or Gaussian, do you think is the better fit to the data in 
problem 2(b)? Suggest a method for quantifying the goodness of each fit. (Hint: see 
Section 2.6.) 

7. Compute the mean, median, and standard deviation of the population for group B in 
Table 2.4. 

8. An astronomer wishes to make a photon-counting measurement of a star’s brightness 
that has a relative precision of 5%. (a) How many photons should she count? (b) How 
many should she count for a relative precision of 0.5%? 

9. A star cluster is a collection of gravitationally bound stars. Individual stars in the
cluster move about with different velocities but the average of all these should give
the velocity of the cluster as a whole. An astronomer measures the radial velocities of
four stars in a cluster that contains 1000 stars. They are 74, 41, 61, and 57 km s$^{-1}$.
How many additional stars should he measure if he wishes to achieve a precision of
2 km s$^{-1}$ for the radial velocity of the cluster as a whole?

10. The astronomer in problem 8 discovers that when she points her telescope to the 
blank sky near the star she is interested in, she measures a background count that is 
50% of what she measures when she points to the star. She reasons that the bright- 
ness of the star (the interesting quantity) is given by 

star = measurement — background 

Revise your earlier estimates. How many measurement photons should she count to 
achieve a relative precision of 5% in her determination of the star brightness? How 
many for 0.5%? 

11. An astronomer makes three one-second measurements of a star’s brightness,
counting 4 photons in the first, 81 in the second, and 9 in the third. What is the best
estimate of the average photon arrival rate and its uncertainty?

Compute this (a) by using Equation (2.18), then (b) by noting that a total of 
94 photons have arrived in 3 seconds. 

Explain why these two methods give such different results. Which method is the 
correct one? Is there anything peculiar about the data that would lead you to 
question the validity of either method? 






12. We repeat problem 14 from Chapter 1, where a single unknown star and standard star 
are observed in the same field. The data frame is in the figure below. The unknown 
star is the fainter one. If the magnitude of the standard is 9.000, compute the 
magnitude of the unknown, as in problem 1.14, but now also compute the uncertainty 
of your result, in magnitudes. Again data numbers represent the number of photons 
counted in each pixel. 





 34   16   26   33   37   22   25   25   29   19   28   25
 22   20   44   34   22   26   14   30   30   20   19   17
 31   70   98   66   37   25   35   36   39   39   23   20
 34   99  229  107   38   28   46  102  159   93   37   22
 33   67  103   67   36   32   69  240  393  248   69   30
 22   33   34   29   36   24   65  241  363  244   68   24
 28   22   17   16   32   24   46   85  157   84   42   22
 18   25   27   26   17   18   30   29   35   24   30   27
 32   23   16   29   25   24   30   28   20   35   22   23
 28   28   28   24   26   26   17   19   30   35   30   26




Chapter 3 

Place, time, and motion 



Then, just for a minute. . . he turned off the lights. . . . And then while we all still 
waited I understood that the terror of my dream was not about losing just vision, 
but the whole of myself, whatever that was. What you lose in blindness is the 
space around you, the place where you are, and without that you might not exist. 
You could be nowhere at all. 

— Barbara Kingsolver, Animal Dreams, 1990 

Where is Mars? The center of our Galaxy? The brightest X-ray source? Where, 
indeed, are we? Astronomers have always needed to locate objects and events in 
space. As our science evolves, it demands ever more exact locations. Suppose, 
for example, an astronomer observes with an X-ray telescope and discovers a 
source that flashes on and off with a curious rhythm. Is this source a planet, a 
star, or the core of a galaxy? It is possible that the X-ray source will appear to be 
quite unremarkable at other wavelengths. The exact position for the X-ray 
source might be the only way to identify its optical or radio counterpart. Astron- 
omers need to know where things are. 

Likewise, knowing when something happens is often as important as where it 
happens. The rhythms of the spinning and orbiting Earth gave astronomy an 
early and intimate connection to timekeeping. Because our Universe is always 
changing, astronomers need to know what time it is. 

The “fixed stars” are an old metaphor for the unchanging and eternal, but 
positions of real celestial objects do change, and the changes tell stories. Planets, 
stars, gas clouds, and galaxies all trace paths decreed for them. Astronomers 
who measure these motions, sometimes only through the accumulated labors of 
many generations, can find in their measurements the outlines of nature’s 
decree. In the most satisfying cases, the measurements uncover fundamental 
facts, like the distances between stars or galaxies, or the age of the Universe, or 
the presence of planets orbiting other suns beyond the Sun. Astronomers need to 
know how things move. 






3.1 Astronomical coordinate systems 

Any problem of geometry can easily be reduced to such terms that a knowledge 
of the lengths of certain straight lines is sufficient for its construction. 

— René Descartes, La Géométrie, Book I, 1637

Descartes’ brilliant application of coordinate systems to solve geometric problems
has direct relevance to astrometry, the business of locating astronomical
objects. Although astrometry has venerably ancient origins,¹ it retains a central
importance in astronomy.

3.1.1 Three-dimensional coordinates 

I assume you are familiar with the standard (x, y, z) Cartesian coordinate system
and the related spherical coordinate system (r, φ, θ), illustrated in Figure 3.1(a).
Think for a moment how you might set up such a coordinate system in practice. 
Many methods could lead to the same result, but consider a process that consists 
of four decisions: 

1. Locate the origin. In astronomy, this often corresponds to identifying some distinctive
real or idealized object: the centers of the Earth, Sun, or Galaxy, for example.

2. Locate the x-y plane. We will call this the “fundamental plane.” The fundamental
plane, again, often has physical significance: the plane defined by the Earth’s equator,
or the one that contains Earth’s orbit, or the symmetry plane of the Galaxy, for
example. The z-axis passes through the origin perpendicular to the fundamental plane.

3. Decide on the direction of the positive x-axis. We will call this the “reference
direction.” Sometimes the reference direction has a physical significance: the direction
from the Sun to the center of the Galaxy, for example. The y-axis then lies in the
fundamental plane, perpendicular to the x-axis.

4. Finally, decide on a convention for the signs of the y- and z-axes. These choices
produce either a left- or right-handed system (see below).

The traditional choice for measuring the angles is to measure the first coordinate,
φ (or λ), within the fundamental plane so that φ increases from the
+x-axis towards the +y-axis (see Figure 3.1). The second angle, θ (or ζ), is
measured in a plane perpendicular to the fundamental plane, increasing from
the positive z-axis towards the x-y plane. In this scheme, φ ranges, in radians,
from 0 to 2π and θ ranges from 0 to π. A common alternative is to measure
the second angle (β in the figure) from the x-y plane, so it ranges between −π/2
and +π/2.



¹ Systematic Babylonian records go back to about 650 BC but with strong hints that the written
tradition had Sumerian roots in the late third millennium. Ruins of megalithic structures with clear
astronomical alignments date from as early as 4500 BC (Nabta, Egypt).








Fig. 3.1 Three-dimensional coordinate systems. (a) The traditional system is
right-handed. (b) This system is left-handed; its axes are a mirror image of
those in (a). In either system one can choose to measure the second angle from
the fundamental plane (e.g. angle β) instead of from the z-axis (angles θ or ζ).




Fig. 3.2 A spherical 
triangle. You must 
imagine this figure is 
drawn on the surface of a 
sphere. A, B, and C are 
spherical angles; a, b, and 
c are arcs of great circles. 



The freedom to choose the signs of the y- and z-axes in step 4 of this
procedure implies that there are two (and only two) kinds of coordinate systems.
One, illustrated in Figure 3.1(a), is right-handed: if you wrap the fingers of your
right hand around the z-axis so the tips point in the +φ direction (that is, from
the +x-axis towards the +y-axis), then your thumb will point in the +z direction.
In a left-handed system, like the (r, λ, ζ) system illustrated in Figure 3.1(b), you
use your left hand to find the +z direction. The left-handed system is the mirror
image of the right-handed system. In either system, Pythagoras gives the radial
coordinate as:

$$r = \sqrt{x^{2} + y^{2} + z^{2}}$$
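The relation between the Cartesian and spherical descriptions can be made concrete with a short sketch. It assumes the right-handed convention of Figure 3.1(a), with the first angle measured from the +x-axis towards the +y-axis and the second angle measured from the +z-axis. The function names are illustrative only and do not come from the text.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (r, phi, theta) in radians for a right-handed system.

    phi is measured in the x-y plane from +x towards +y (0 to 2*pi);
    theta is measured from the +z axis towards the x-y plane (0 to pi).
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("angles are undefined at the origin")
    phi = math.atan2(y, x) % (2.0 * math.pi)
    # Clamp guards against rounding pushing the ratio just outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    return r, phi, theta

def spherical_to_cartesian(r, phi, theta):
    """Inverse of cartesian_to_spherical (angles in radians)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def latitude_from_polar_angle(theta):
    """Alternative second angle, measured from the x-y plane (-pi/2 to +pi/2)."""
    return 0.5 * math.pi - theta

r, phi, theta = cartesian_to_spherical(1.0, 1.0, math.sqrt(2.0))
print(r, math.degrees(phi), math.degrees(theta))
```

A left-handed system would differ only in the sign convention chosen for one axis, so the same functions apply after flipping that coordinate.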

3.1.2 Coordinates on a spherical surface 

It is one of the things proper to geography to assume that the Earth as a whole is 
spherical in shape, as the universe also is. . . 

— Strabo, Geography, II, 2, 1, c. AD 18

If all points of interest are on the surface of a sphere, the r coordinate is
superfluous, and we can specify locations with just two angular coordinates like
(φ, θ) or (λ, β). Many astronomical coordinate systems fit into this category, so it
is useful to review some of the characteristics of geometry and trigonometry on
a spherical surface.

1. A great circle is formed by the intersection of the sphere and a plane that contains the
center of the sphere. The shortest distance between two points on the surface of a 
sphere is an arc of the great circle connecting the points. 

2. A small circle is formed by the intersection of the sphere and a plane that does not 
contain the center of the sphere. 

3. The spherical angle between two great circles is the angle between the planes, or the 
angle between the straight lines tangent to the two great circle arcs at either of their 
points of intersection. 

4. A spherical triangle on the surface of a sphere is one whose sides are all segments of 
great circles. Since the sides of a spherical triangle are arcs, the sides can be measured 
in angular measure (i.e. radians or degrees) rather than linear measure. See Figure 3.2. 

5. The law of cosines for spherical triangles in Figure 3.2 is:

$$\cos a = \cos b \cos c + \sin b \sin c \cos A$$
or
$$\cos A = -\cos B \cos C + \sin B \sin C \cos a$$



6. The law of sines is

$$\frac{\sin a}{\sin A} = \frac{\sin b}{\sin B} = \frac{\sin c}{\sin C}$$
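The two laws can be exercised numerically. The sketch below solves a spherical triangle from two sides and the included angle using the law of cosines, then checks the result against the law of sines. The function names and the sample triangle are invented for illustration.

```python
import math

def third_side(b, c, A):
    """Side a opposite angle A, from sides b, c and the included angle A.

    All quantities are in radians.
    Uses cos a = cos b cos c + sin b sin c cos A.
    """
    cos_a = (math.cos(b) * math.cos(c)
             + math.sin(b) * math.sin(c) * math.cos(A))
    return math.acos(max(-1.0, min(1.0, cos_a)))

def opposite_angle(side, other1, other2):
    """Angle opposite `side`, given all three sides (radians).

    This is the same law of cosines, rearranged to solve for the angle.
    """
    cos_angle = ((math.cos(side) - math.cos(other1) * math.cos(other2))
                 / (math.sin(other1) * math.sin(other2)))
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def solve_triangle(b, c, A):
    """Return (a, B, C) for a triangle specified by b, c and included angle A."""
    a = third_side(b, c, A)
    B = opposite_angle(b, a, c)
    C = opposite_angle(c, a, b)
    return a, B, C

# Sample triangle: b = 60 deg, c = 45 deg, included angle A = 70 deg.
b, c, A = (math.radians(v) for v in (60.0, 45.0, 70.0))
a, B, C = solve_triangle(b, c, A)

# Law of sines: the three ratios should agree.
ratios = [math.sin(a) / math.sin(A),
          math.sin(b) / math.sin(B),
          math.sin(c) / math.sin(C)]
print([round(x, 6) for x in ratios])
```

Solving for the angles with the law of cosines rather than the law of sines avoids the ambiguity between an angle and its supplement, since the sine alone cannot distinguish them.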










3.1.3 Terrestrial latitude and longitude 

“I must be getting somewhere near the center of the Earth. . .yes. . .but then I 
wonder what Latitude and Longitude I've got to?” (Alice had not the slightest 
idea what Latitude was, nor Longitude either, but she thought they were nice 
grand words to say.) 

— Lewis Carroll, Alice’s Adventures in Wonderland, 1897 

Ancient geographers introduced the seine-like latitude-longitude system for
specifying locations on Earth well before the time Hipparchus of Rhodes
(c. 190-120 BC) wrote on geography. Figure 3.3 illustrates the basic features
of the system.

In our scheme, the first steps in setting up a coordinate system are to choose 
an origin and fundamental plane. We can understand why Hipparchus, who 
believed in a geocentric cosmology, would choose the center of the Earth as 
the origin. Likewise, choice of the equatorial plane of the Earth as the fundamental
plane makes a lot of practical sense. Although the location of the equator
may not be obvious to a casual observer like Alice, it is easily determined from
simple astronomical observations. Indeed, in his three-volume book on geography,
Eratosthenes of Alexandria (c. 275 to c. 194 BC) is said to have computed
the location of the equator relative to the parts of the world known to him. At the 
time, there was considerable dispute as to the habitability of the (possibly too 
hot) regions near the equator, but Eratosthenes clearly had little doubt about 
their location. 

Great circles perpendicular to the equator must pass through both poles, and
such circles are termed meridians. The place where one of these (the prime
meridian) intersects the equator could constitute a reference direction (x-axis).
Unfortunately, on Earth, there is no obvious meridian to use for this purpose.
Many choices are justifiable, and for a long time geographers simply chose a
prime meridian that passed through some locally prominent or worthy place.
Thus, the latitude of any point on Earth was unique, but its longitude was not,
since it depended on which meridian one chose as prime. This was inconvenient.
Eventually, in 1884, the “international” community (in the form of representatives
of 25 industrialized countries meeting in Washington, DC, at the First
International Meridian Conference) settled the zero point of longitude at the
meridian of the Royal Observatory in Greenwich, located just outside London,
England.

Fig. 3.3 The latitude-longitude system. The center of coordinates is at C. The
fundamental direction, line CX, is defined by the intersection of the prime
meridian (great circle NGX) and the equator. Latitude, β, and longitude, λ, for
some point, P, are measured as shown. Latitude is positive north of the equator,
negative south. Astronomical longitude for Solar System bodies is positive in the
direction opposite the planet's spin (i.e. to the west on Earth). On Earth,
coordinates traditionally carry no algebraic sign, but are designated as north or
south latitude, and west or east longitude. The coordinate, β, is the geocentric
latitude. The coordinate actually used in practical systems is the geodetic
latitude (see the text).

Fig. 3.4 Geocentric (β) and geodetic (φ) latitudes. Line PF is perpendicular to
the surface of the reference spheroid, and approximately in the direction of the
local vertical (local gravitational force).

You should note that the latitude coordinate, β, just discussed, is called the
geocentric latitude, to distinguish it from φ, the geodetic latitude. Geodetic
latitude is defined in reference to an ellipsoid-of-revolution that approximates
the actual shape of the Earth. It is the angle between the equatorial plane and a
line perpendicular to the surface of the reference ellipsoid at the point in question.

Figure 3.4 shows the north pole, N, equator, E, and center, O, of the Earth.
The geocentric and geodetic latitudes of point P are β and φ, respectively.
Geodetic latitude is easier to determine and is the one employed in specifying
positions on the Earth. The widely used technique of global positioning satellites
(GPS), for example, returns geodetic latitude, longitude, and height above a
reference ellipsoid. To complicate things a bit more, the most easily determined
latitude is the geographic latitude, the angle between the local vertical and the
equator. Massive objects like mountains affect the geographic but not the geodetic
latitude and the two can differ by as much as an arc minute. Further
complications on the sub-arc-second scale arise from short- and long-term
motion of the geodetic pole itself relative to the Earth’s crust due to tides,
earthquakes, internal motions, and continental drift.
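To get a feel for the size of the difference between the two latitudes, here is a minimal sketch for a point on the surface of the reference ellipsoid. It relies on the standard relation tan β = (b/a)² tan φ, where a and b are the equatorial and polar radii of the ellipsoid. The text does not name an ellipsoid, so the WGS 84 flattening used here is an assumption, as are the function names.

```python
import math

# Assumed reference ellipsoid: WGS 84 flattening f = (a - b) / a.
WGS84_FLATTENING = 1.0 / 298.257223563

def geocentric_latitude_deg(geodetic_deg, flattening=WGS84_FLATTENING):
    """Geocentric latitude (beta) of a surface point with geodetic latitude phi.

    Uses tan(beta) = (b/a)**2 * tan(phi), valid for points on the ellipsoid.
    """
    phi = math.radians(geodetic_deg)
    axis_ratio_sq = (1.0 - flattening) ** 2  # (b/a)**2
    # atan2 form stays well behaved at the poles, where tan(phi) diverges.
    beta = math.atan2(axis_ratio_sq * math.sin(phi), math.cos(phi))
    return math.degrees(beta)

for phi in (0.0, 30.0, 45.0, 60.0, 90.0):
    beta = geocentric_latitude_deg(phi)
    print(f"geodetic {phi:5.1f} deg  geocentric {beta:9.5f} deg  "
          f"difference {60.0 * (phi - beta):6.2f} arcmin")
```

Under these assumptions the two latitudes agree at the equator and the poles and differ most near latitude 45 degrees, by a bit more than 11 arc minutes.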

Planetary scientists establish latitude-longitude systems on other planets,
with latitude usually easily defined by the object’s rotation, while definition 
of longitude depends on identifying some feature to mark a prime meridian. 

Which of the two poles of a spinning object is the “north” pole? In the Solar
System, the preferred (but not universal!) convention is that the ecliptic (the plane
containing the Earth’s orbit) defines a fundamental plane, and a planet’s north
pole is the one that lies to the (terrestrial) north side of this plane. Longitude
should be measured as increasing in the direction opposite the spin direction.

For other objects, practices vary. One system says the north pole is determined
by a right-hand rule applied to the direction of spin: wrap the fingers of
your right hand around the object’s equator so that they point in the direction of 
its spin. Your thumb then points north (in this case, “north” is in the same 
direction as the angular momentum vector). 



3.1.4 The altitude-azimuth system 

Imagine an observer, a shepherd with a well-behaved flock, say, who has some 
leisure time on the job. Our shepherd is lying in an open field, contemplating the 
sky. After a little consideration, our observer comes to imagine the sky as a 
hemisphere, an inverted bowl whose edges rest on the horizon. Astronomical
objects, whatever their real distances, can be seen to be stuck onto or projected
onto the inside of this hemispherical sky.

This is another situation in which the r-coordinate becomes superfluous. 
The shepherd will find it difficult or impossible to determine the r coordinate 
for the objects in the sky. He knows the direction of a star but not its distance 
from the origin (which he will naturally take to be himself). Astronomers often 
find themselves in the same situation as the shepherd. A constant theme throughout
astronomy is the problem of the third dimension, the r-coordinate: the
directions of objects are easily and accurately determined, but their distances are 
not. This prompts us to use coordinate systems that ignore the r-coordinate and 
only specify the two direction angles. 

In Figure 3.5, we carry the shepherd’s fiction of a hemispherical sky a little 
bit further, and imagine that the hemispherical bowl of the visible sky is 
matched by a similar hemisphere below the horizon, so that we are able to apply 
a spherical coordinate scheme like the one illustrated. Here, the origin of the 
system is at O, the location of the observer. The fundamental plane is that of the 
“flat” Earth (or, to be precise, a plane tangent to the tiny spherical Earth at point 
O). This fundamental plane intersects the sphere of the sky at the celestial 
horizon, the great circle passing through the points NES in the figure. Vertical
circles are great circles on the spherical sky that are perpendicular to the fundamental
plane. All vertical circles pass through the overhead point, which is
called the zenith (point T in the figure), as well as the diametrically opposed
point, called the nadir. The vertical circle that runs in the north-south direction
(circle NTS in the figure) is called the observer’s meridian.

The fundamental direction in the altitude-azimuth coordinate system runs
directly north from the observer to the intersection of the meridian and the
celestial horizon (point N in the figure). In this system, a point on the sky, P,
has two coordinates:

• The altitude, or elevation, is the angular distance of P above the horizon (∠QOP or e
in the figure). Objects below the horizon have negative altitudes.

• The azimuth is the angular distance from the reference direction (the north point on
the horizon) to the intersection of the horizon and the vertical circle passing through
the object (∠NOQ or a in the figure).

Instead of the altitude, astronomers sometimes use its complement, z, the zenith
distance (∠TOP in the figure).

Fig. 3.5 The altitude-azimuth system. The horizon defines the fundamental plane
(gray) and the north point on the horizon, N, defines the fundamental direction.
Point P has coordinates a (azimuth), which is measured along the horizon circle
from north to east, and e (altitude), measured upwards from the horizon. Objects
with negative altitudes are below the horizon.

The (a, e) coordinates of an object clearly describe where it is located in an 
observer’s sky. You can readily imagine an instrument that would measure these 
coordinates: a telescope or other sighting device mounted to rotate on vertical 
and horizontal circles that are marked with precise graduations. 

One of the most elementary astronomical observations, noticed even by the 
most unobservant shepherd, is that celestial objects don’t stay in the same place 
in the horizon coordinate system. Stars, planets, the Sun, and Moon all execute a
diurnal motion: they rise in the east, cross the observer’s meridian, and set in
the west. This, of course, is a reflection of the spin of our planet on its axis. The 
altitude and azimuth of celestial objects will change as the Earth executes its 
daily rotation. Careful measurement will show that stars (but not the Sun and 
planets, which move relative to the “fixed” stars) will take about 23 hours, 
56 minutes and 4.1 seconds between successive meridian crossings. This 
period of time is known as one sidereal day. Very careful observations would 
show that the sidereal day is actually getting longer, relative to a stable atomic 
clock, by about 0.0015 second per century. The spin rate of the Earth is slowing 
down. 
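The length of the sidereal day quoted above can be recovered from a simple counting argument: over one year the Earth turns once more relative to the stars than it does relative to the Sun. The sketch below applies that argument. The year length of 365.2422 mean solar days is an assumed value that does not appear in the text, and the function names are illustrative.

```python
SOLAR_DAY_SECONDS = 86400.0
# Assumed length of the tropical year in mean solar days (not from the text).
TROPICAL_YEAR_SOLAR_DAYS = 365.2422

def sidereal_day_seconds(year_in_solar_days=TROPICAL_YEAR_SOLAR_DAYS):
    """Length of one sidereal day in seconds of mean solar time.

    A year containing N solar days contains N + 1 sidereal days.
    """
    return SOLAR_DAY_SECONDS * year_in_solar_days / (year_in_solar_days + 1.0)

def as_hms(seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours = int(seconds // 3600)
    minutes = int((seconds % 3600) // 60)
    return hours, minutes, seconds % 60.0

length = sidereal_day_seconds()
h, m, s = as_hms(length)
print(f"{length:.2f} s = {h} h {m} min {s:.1f} s")
print(f"stars transit {SOLAR_DAY_SECONDS - length:.0f} s earlier each solar day")
```

The result is close to the 23 hours, 56 minutes and 4.1 seconds given above, which is why a given star crosses the meridian nearly four minutes earlier on each successive night.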



3.1.5 The equatorial system: definition of coordinates 

Because the altitude and azimuth of celestial objects change rapidly, we create 
another reference system, one in which the coordinates of stars remain the same. 
In this equatorial coordinate system, we carry the fiction of the spherical sky 
one step further. Imagine that all celestial objects were stuck on a sphere of very 
large radius, whose center is at the center of the Earth. Furthermore, imagine 
that the Earth is insignificantly small compared to this celestial sphere. Now 
adopt a geocentric point of view. You can account for the diurnal motion of 
celestial objects by presuming that the entire celestial sphere spins east to west 
on an axis coincident with the Earth’s actual spin axis. Relative to one another,
objects on the sphere never change their positions (not quite true; see below).
The star patterns that make up the figures of the constellations stay put, while
terrestrials observe the entire sky (the global pattern of constellations) to spin
around its north-south axis once each sidereal day. Objects stuck on the celestial
sphere thus appear to move east to west across the terrestrial sky, traveling in 
small circles centered on the nearest celestial pole. 






The fictional celestial sphere is an example of a scientific model. Although 
the model is not the same as the reality, it has features that help one discuss, 
predict, and understand real behavior. (You might want to think about the 
meaning of the word “understand” in a situation where model and reality differ 
so extensively.) The celestial-sphere model allows us to specify the positions of 
the stars in a coordinate system, the equatorial system, which is independent of 
time, at least on short scales. Because positions in the equatorial coordinate 
system are also easy to measure from Earth, it is the system astronomers use 
most widely to locate objects on the sky. 

The equatorial system nominally chooses the center of the Earth as the origin 
and the equatorial plane of the Earth as the fundamental plane. This aligns the 
z-axis with the Earth’s spin axis, and fixes the locations of the two celestial
poles, the intersections of the z-axis and the celestial sphere. The great circle
defined by the intersection of the fundamental plane and the celestial sphere is
called the celestial equator. One can immediately measure a latitude-like coordinate
with respect to the celestial equator. This coordinate is called the declination
(abbreviated as Dec or δ), whose value is taken to be zero at the equator,
and positive in the northern celestial hemisphere; see Figure 3.6.

We choose the fundamental direction in the equatorial system by observing 
the motion of the Sun relative to the background of “fixed” stars. Because of the 
Earth’s orbital motion, the Sun appears to trace out a great circle on the celestial 
sphere in the course of a year. This circle is called the ecliptic (it is where 
eclipses happen) and intersects the celestial equator at an angle, ε, called the
obliquity of the ecliptic, equal to about 23.5 degrees. The point where the Sun
crosses the equator traveling from south to north is called the vernal equinox
and this point specifies the reference direction of the equatorial system. The
coordinate angle measured in the equatorial plane is called the right ascension
(abbreviated as RA or α). As shown in Figure 3.6, the equatorial system is
right-handed, with RA increasing from west to east.

For reasons that will be apparent shortly, RA is usually measured in
hours:minutes:seconds, rather than in degrees (24 hours of RA constitute 360 degrees
of arc at the equator, so one hour of RA is 15 degrees of arc long at the equator).
To deal with the confusion that arises from both the units of RA and the units of
Dec having the names “minutes” and “seconds”, one can speak of “minutes (or
seconds) of time” to distinguish RA measures from the “minutes of arc” used to
measure Dec.
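Because one hour of RA corresponds to 15 degrees, one minute of time is 15 minutes of arc and one second of time is 15 seconds of arc. A small sketch of the conversion in both directions follows; the function names are illustrative.

```python
def ra_to_degrees(hours, minutes=0.0, seconds=0.0):
    """Convert right ascension in hours:minutes:seconds of time to degrees."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)

def degrees_to_ra(degrees):
    """Convert an angle in degrees to (hours, minutes, seconds) of time."""
    total_hours = (degrees % 360.0) / 15.0
    hours = int(total_hours)
    minutes_full = (total_hours - hours) * 60.0
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60.0
    return hours, minutes, seconds

print(ra_to_degrees(1))         # one hour of RA in degrees
print(ra_to_degrees(0, 1))      # one minute of time in degrees
print(degrees_to_ra(90.0))      # a quarter of the way around the equator
```

Declination needs no such factor, since it is already expressed in degrees, minutes of arc and seconds of arc.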

3.1.6 The relation between the equatorial and the 
horizon systems 

Figure 3.7 shows the celestial sphere with some of the features of both the
horizon and equatorial systems marked. The figure assumes an observer,
“O”, located at about 60 degrees north latitude on Earth. Note the altitude of
the north celestial pole (angle NOP in Figure 3.7a). You should be able to
construct a simple geometric argument to convince yourself that the altitude
angle of the north celestial pole equals the observer’s geodetic latitude.

Fig. 3.6 The equatorial coordinate system. In both celestial spheres pictured,
the equator is the great circle passing through points V and B, and the ecliptic
is the great circle passing through points V and S. The left-hand sphere shows
the locations of the north (N) and south (M) celestial poles, the vernal (V) and
autumnal (A) equinoxes, the summer (S) solstice, and the hour circles for 0 Hr
(arc NVM) and 6 Hr (arc NBM) of right ascension. The right-hand sphere shows
the right ascension (∠VOQ, or α) and declination (∠QOP, or δ) of the point P.

Observer “O,” using the horizon system, will watch the celestial sphere turn, 
and see stars move along the projected circles of constant declination. Figure 3.7a 
shows the declination circle of a star that just touches the northern horizon. Stars 
north of this circle never set and are termed circumpolar. Figure 3.7a also shows 
the declination circle that just touches the southern horizon circle, and otherwise 
lies entirely below it. Unless she changes her latitude, “O” can never see any of 
the stars south of this declination circle. 
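The circumpolar and never-rising conditions follow from the fact that the pole sits at an altitude equal to the observer's latitude. The sketch below classifies a star from its declination and the observer's latitude by computing its altitude at upper and lower meridian crossing. It ignores atmospheric refraction, and the function names are illustrative rather than taken from the text.

```python
def upper_transit_altitude(dec_deg, latitude_deg):
    """Altitude (degrees) at the star's highest meridian crossing."""
    return 90.0 - abs(latitude_deg - dec_deg)

def lower_transit_altitude(dec_deg, latitude_deg):
    """Altitude (degrees) at the star's lowest meridian crossing."""
    return abs(latitude_deg + dec_deg) - 90.0

def classify_star(dec_deg, latitude_deg):
    """Return 'circumpolar', 'never rises', or 'rises and sets'."""
    if lower_transit_altitude(dec_deg, latitude_deg) > 0.0:
        return "circumpolar"
    if upper_transit_altitude(dec_deg, latitude_deg) < 0.0:
        return "never rises"
    return "rises and sets"

# Observer at 60 degrees north, as in Figure 3.7.
for dec in (45.0, 0.0, -45.0):
    print(dec, classify_star(dec, 60.0), upper_transit_altitude(dec, 60.0))
```

For the observer at 60 degrees north, the limiting declination circles are at +30 degrees (just touching the northern horizon) and −30 degrees (just touching the southern horizon).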

Reference to Figure 3.7a also helps define a few other terms. Stars that are 
neither circumpolar nor permanently below the horizon will rise in the east, 
cross, or transit, the observer’s celestial meridian, and set in the west. When a
star transits the meridian it has reached its greatest altitude above the horizon, 
and is said to have reached its culmination. Notice that circumpolar stars can 
be observed to cross the meridian twice each sidereal day (once when they are 
highest in the sky, and again when they are lowest). To avoid confusion, the 
observer’s celestial meridian is divided into two pieces at the pole. The smaller 
bit visible between the pole and the horizon (arc NP in the figure) is called 
the lower meridian, and the remaining piece (arc PTML) is called the upper 
meridian. 

Figure 3.7b shows a star, S, which has crossed the upper meridian some 
time ago and is moving to set in the west. A line of constant right ascension is a 
great circle called an hour circle, and the hour circle for star S is shown in the 
figure. 

You can specify how far an object is from the meridian by giving its hour
angle. The hour circle of an object and the upper celestial meridian intersect at
the pole. The hour angle, HA, is the angle between them. Application of the law
of sines to a spherical right triangle shows that the hour angle could also be
measured along the equator, as the arc that runs from the intersection of the
meridian and equator to the intersection of the star’s hour circle and the equator
(arc RM in Figure 3.7b). Hour angle, like right ascension, is usually measured in
time units. Recalling that RA is measured in the plane of the equator, we can
state one other definition of the hour angle:

HA of the object = RA on meridian − RA of the object

Fig. 3.7 The horizon and equatorial systems. Both spheres show the horizon,
equator and observer's meridian, the north celestial pole at P, and the zenith at
T. Sphere (a) illustrates the diurnal paths of a circumpolar star and of a star
that never rises. Sphere (b) shows the hour circle (PSR) of a star at S, as well
as its declination, δ, its hour angle, HA = arc RM = ∠MPS, its altitude, e, and
its zenith distance, z.

The hour angle of a star is useful because it tells how long ago (in the case 
of positive HA) or how long until (negative HA) the star crossed, or will 
cross, the upper meridian. The best time to observe an object is usually when 
it is highest in the sky, that is, when the HA is zero and the object is at 
culmination. 
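The hour-angle definition can be turned into a few lines of code. The only subtlety is the wrap-around at 24 hours: the sketch below folds the result into the range from −12 to +12 hours so that positive values mean the object has already crossed the upper meridian and negative values mean it has yet to cross. The function name is illustrative.

```python
def hour_angle_hours(ra_on_meridian_hours, ra_object_hours):
    """Hour angle in hours, folded into the interval (-12, +12].

    HA = (RA on the meridian) - (RA of the object).
    """
    ha = (ra_on_meridian_hours - ra_object_hours) % 24.0
    if ha > 12.0:
        ha -= 24.0
    return ha

print(hour_angle_hours(5.0, 3.0))    # crossed the meridian two hours ago
print(hour_angle_hours(1.0, 23.0))   # same, across the 24-hour wrap
print(hour_angle_hours(23.0, 1.0))   # will cross in two hours
```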

To compute the hour angle from the formula above, you realize that the RA 
of the object is always known — you can look it up in a catalog or read it from a 
star chart. How do you know the right ascension of objects on the meridian? You 
read that from a sidereal clock. 

A sidereal clock is based upon the apparent motions of the celestial sphere. 
A clockmaker creates a clock that ticks off exactly 24 uniform “sidereal” hours 
between successive upper meridian transits by the vernal equinox (a period of 
about 23.93 “normal” hours, remember). If one adjusts this clock so that it reads 
zero hours at precisely the moment the vernal equinox transits, then it gives the 
correct sidereal time. 



Sidereal day = Time between upper meridian transits 
by the vernal equinox 

A sidereal clock mimics the sky, where the hour circle of the vernal equinox can
represent the single hand of a 24-hour clock, and the observer’s meridian can
represent the “zero hour” mark on the clockface. There is a nice correspondence
between the reading of any sidereal clock and the right ascension coordinate,
namely



sidereal time = right ascension of an object on the upper meridian 

It should be clear that we can restate the definition of hour angle as: 

HA of object = sidereal time now − sidereal time object culminates

If either the sidereal time or an object’s hour angle is known, one can derive the
coordinate transformations between equatorial (α, δ) and the horizon (e, a)
coordinates for that object. Formulae are given in Appendix D.
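As an illustration of such a transformation, the sketch below converts hour angle and declination to altitude and azimuth for an observer at a given latitude. It uses the standard spherical-trigonometry relations, which follow from the law of cosines applied to the pole-zenith-star triangle, and is not copied from Appendix D. Azimuth is measured from north through east, as in Figure 3.5, and refraction is ignored. The function name and argument conventions are assumptions.

```python
import math

def equatorial_to_horizon(ha_hours, dec_deg, latitude_deg):
    """Return (azimuth, altitude) in degrees.

    ha_hours     : hour angle in hours (positive west of the meridian)
    dec_deg      : declination in degrees
    latitude_deg : observer's latitude in degrees
    Azimuth runs from north (0) through east (90).
    """
    H = math.radians(15.0 * ha_hours)
    dec = math.radians(dec_deg)
    lat = math.radians(latitude_deg)

    sin_alt = (math.sin(dec) * math.sin(lat)
               + math.cos(dec) * math.cos(lat) * math.cos(H))
    alt = math.asin(max(-1.0, min(1.0, sin_alt)))

    # Components of the star's direction towards east and towards north.
    east = -math.cos(dec) * math.sin(H)
    north = (math.sin(dec) * math.cos(lat)
             - math.cos(dec) * math.sin(lat) * math.cos(H))
    az = math.atan2(east, north) % (2.0 * math.pi)
    return math.degrees(az), math.degrees(alt)

# A star on the celestial equator, on the meridian, seen from 60 degrees north.
print(equatorial_to_horizon(0.0, 0.0, 60.0))
```

Given a sidereal time and a right ascension instead of an hour angle, one would first form HA = sidereal time − RA and then call the same function.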



3.1.7 Measuring equatorial coordinates

Astronomers use the equatorial system because RA and Dec are easily determined
with great precision from Earth-based observatories. You should have a
general idea of how this is done. Consider a specialized instrument, called a
transit telescope (or meridian circle). The transit telescope is constrained to point
only at objects on an observer’s celestial meridian: it rotates on an axis aligned
precisely east-west. The telescope is rigidly attached to a graduated circle centered
on this axis. The circle lies in the plane of the meridian and rotates with the
telescope. A fixed index, established using a plumb line perhaps, always points 
to the zenith. By observing where this index falls on the circle, the observer can 
thus determine the altitude angle (or zenith distance) at which the telescope is 
pointing. The observer is also equipped with a sidereal clock, which ticks off 24 
sidereal hours between upper transits of the vernal equinox. 

To use the transit telescope to determine declinations, first locate the celestial 
pole. Pick out a circumpolar star. Read the graduated circle when you observe 
the star cross the upper meridian and then again when it crosses the lower meridian. The
average of the two readings gives the location of ±90° declination (the north or 
south celestial pole) on your circle. After this calibration you can then read the 
declination of any other transiting star directly from the circle. 

To find the difference in the RA of any two objects, subtract the sidereal 
clock reading when you observe the first object transit from the clock reading 
when you observe the second object transit. To locate the vernal equinox and the 
zero point for the RA coordinate, require that the right ascension of the Sun be 
zero when you observe its declination to be zero in the spring. 
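The reduction just described amounts to a few subtractions. The sketch below assumes a northern observer, a circle graduated in degrees that increase steadily along the meridian, and a sidereal clock read in hours; these conventions and the function names are assumptions made for illustration, not details given in the text.

```python
def pole_reading(upper_transit_reading_deg, lower_transit_reading_deg):
    """Circle reading of the celestial pole.

    A circumpolar star is equally far from the pole at its upper and lower
    transits, so the pole lies midway between the two circle readings.
    """
    return 0.5 * (upper_transit_reading_deg + lower_transit_reading_deg)

def declination_from_reading(reading_deg, pole_reading_deg):
    """Declination (degrees) of a star transiting at the given circle reading.

    The star's angular distance from the north pole is |reading - pole|,
    and declination is 90 degrees minus that distance.
    """
    return 90.0 - abs(reading_deg - pole_reading_deg)

def ra_difference_hours(first_transit_clock_hours, second_transit_clock_hours):
    """RA of the second object minus RA of the first, in hours (0 to 24)."""
    return (second_transit_clock_hours - first_transit_clock_hours) % 24.0

# Hypothetical readings: a circumpolar star transits at 50 and 30 degrees.
pole = pole_reading(50.0, 30.0)
print(pole)                                  # pole at 40 degrees on the circle
print(declination_from_reading(130.0, pole)) # a star 90 degrees from the pole
print(ra_difference_hours(23.5, 1.0))        # transits 1.5 sidereal hours apart
```

Fixing the RA zero point then only requires adding a constant so that the Sun's RA is zero at the moment its measured declination passes through zero in the spring.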

Astrometry is the branch of astronomy concerned with measuring the positions,
and changes in position, of sources. Chapter 11 of Birney et al. (2006)
gives a more thorough introduction to the subject than we will do here, and Monet
(1988) gives a more advanced discussion. The Powerpoint presentation on the
Gaia website (http://www.rssd.esa.int/Gaia) gives a good introduction to
astrometry from space.






Observations with a transit telescope can measure arbitrarily large angles 
between sources, and the limits to the accuracy of large-angle astrometry 
are different from, and usually much more severe than, the limits to small-angle
astrometry. In small-angle astrometry, one measures positions of a source
relative to a local reference frame (e.g. stars or galaxies) contained on the same 
detector field. Examples of small-angle astrometry are the measurement of 
the separation of double stars with a micrometer-equipped eyepiece, the measurement
of stellar parallax from a series of photographs, or the measurement of
the position of a minor planet in two successive digital images of the same field. 

The angular size and regularity of the stellar images formed by the transit 
telescope limit the precision of large-angle astrometry. The astronomer or her 
computer must decide when and where the center of the image transits, a task 
made difficult if the image is faint, diffuse, irregular, or changing shape on a 
short time scale. In the optical or near infrared, atmospheric seeing usually 
limits ground-based position measurements to an accuracy of about 0.05 arcsec, 
or 50 milli-arcsec (mas). 

Positional accuracy at radio wavelengths is much greater. The technique of 
very long baseline interferometry (VLBI) can determine coordinates for point-like
radio sources (e.g. the centers of active galaxies) with uncertainties less
than 1 mas. Unfortunately, most normal stars are not sufficiently powerful radio 
sources to be detected, and their positions must be determined by optical 
methods. 

There are other sources of error in wide-angle ground-based astrometry. 
Refraction by the atmosphere (see Figure 3.8 and Appendix D) changes the 
apparent positions of radio and (especially) optical sources. Variability of the 
atmosphere can produce inaccuracies in the correction made for refraction. 
Flexure of telescope and detector parts due to thermal expansion or variations 
in gravitational loading can cause serious systematic errors. Any change, for 
example, that moves the vertical index relative to the meridian circle will 
introduce inconsistencies in declination measurements. 
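For objects well away from the horizon, the size of the refraction shift can be estimated with a plane-parallel model of the atmosphere, in which the shift is proportional to the tangent of the zenith distance. The sketch below uses a coefficient of about 58 arc seconds, a commonly quoted sea-level textbook value that is an assumption here and is not taken from this text or from Appendix D. The model breaks down near the horizon, so the function refuses zenith distances beyond 75 degrees.

```python
import math

# Assumed coefficient in arc seconds for typical sea-level conditions.
REFRACTION_COEFF_ARCSEC = 58.2
MAX_ZENITH_DISTANCE_DEG = 75.0  # plane-parallel model is unreliable beyond this

def refraction_arcmin(zenith_distance_deg):
    """Approximate refraction shift (arc minutes) towards the zenith.

    Plane-parallel approximation: R = k * tan(z).
    """
    if not 0.0 <= zenith_distance_deg <= MAX_ZENITH_DISTANCE_DEG:
        raise ValueError("model only valid for 0 <= z <= 75 degrees")
    z = math.radians(zenith_distance_deg)
    return REFRACTION_COEFF_ARCSEC * math.tan(z) / 60.0

for z in (20.0, 70.0):
    print(f"z = {z:4.1f} deg  shift = {refraction_arcmin(z):.2f} arcmin")
```

Under these assumptions the estimates at zenith distances of 20 and 70 degrees land close to the 0.3 and 2.6 arc minute shifts tabulated with Figure 3.8. The much larger shifts near the horizon need a more complete treatment of the curved, layered atmosphere.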

Modern procedures for measuring equatorial coordinates are much more
refined than those described at the beginning of this section, but the underlying 
principles are the same. Most ground-based transit measurements are automated 
with a variety of electronic image detectors and strategies for determining 
transit times. 

Space-based large-angle astrometry uses principles similar to the ground- 
based programs. Although ground-based transit telescopes use the spinning 
Earth as a platform to define both direction and time scale, any uniformly 
spinning platform and any clock could be equivalently employed. The spin of 
the artificial satellite HIPPARCOS, for example, allowed it to measure stellar 
positions by timing transits in two optical telescopes mounted on the satellite. 
Because images in space are neither blurred by atmospheric seeing nor subject to
atmospheric refraction, most of the 120,000 stars in the HIPPARCOS catalog
have positional accuracies around 0.7 mas in each coordinate. A future mission,
Gaia (the European Space Agency expects launch in 2012), will use a similar
strategy with vastly improved technology. Gaia anticipates positional accuracies
on the order of 0.007 mas (= 7 μas) for bright stars and accuracies better than 0.3
mas for close to a billion objects brighter than V = 20.

Fig. 3.8 Atmospheric refraction. The observer is on the surface at point O. The actual path of a light ray from object A is curved by the atmosphere, and O receives light from direction A'. Likewise, the image of object B appears at B', a smaller shift in position because both the path length and the angle of incidence are smaller. Refraction thus reduces the zenith distance of all objects, affecting those close to the horizon more than those near the zenith. The table below gives approximate shifts in arc minutes for different zenith distances.

Zenith distance       90°    89°    85°    70°    50°    20°
Shift in arc min      35     24     10     2.6    1.7    0.3

Catalogs produced with large-angle astrometric methods like transit tele- 
scope observations or the Gaia and HIPPARCOS missions are usually called 
fundamental catalogs. 

It is important to realize that although the relative positions of most “fixed” 
stars on the celestial sphere normally do not change appreciably on time scales 
of a year or so, their equatorial coordinates do change by as much as 50 arcsec 
per year due to precession and other effects. Basically, the locations of the
celestial pole, equator, and equinox are always moving (see Section 3.1.8
below). This is an unfortunate inconvenience. Any measurement of RA and 
Dec made with a transit circle or other instrument must allow for these changes. 
What is normally done is to correct measurements to compute the coordinates 
that the celestial location would have at a certain date. Currently, the celestial 
equator and equinox for the year 2000 (usually written as J2000.0) are likely to 
be used. 

You should also realize that even the relative positions of some stars, espe- 
cially nearby stars, do change very slowly due to their actual motion in space 
relative to the Sun. This proper motion, although small (a large proper motion 
would be a few arcsec per century), will cause a change in coordinates over 
time, and an accurate specification of coordinates must give the epoch (or date) 
for which they are valid. See Section 3.4.2 below. 

3.1.8 Precession and nutation 

Conservation of angular momentum might lead one to expect that the Earth’s 
axis of rotation would maintain a fixed orientation with respect to the stars. 
However, the Earth has a non-spherical mass distribution, so it does experience
gravitational torques from the Moon (primarily) and Sun. In addition to this 
lunisolar effect, the other planets produce much smaller torques. As a result of 
all these torques, the spin axis changes its orientation, and the celestial poles and 
equator change their positions with respect to the stars. This, of course, causes 
the RA and Dec of the stars to change with time. 

This motion is generally separated into two components, a long-term general 
trend called precession, and a short-term oscillatory motion called nutation. 
Figure 3.9 illustrates precession: the north ecliptic pole remains fixed with 
respect to the distant background stars, while the north celestial pole (NCP) 
moves in a small circle whose center is at the ecliptic pole. The precessional 
circle has a radius equal to the average obliquity (around 23 degrees), with the 
NCP completing one circuit in about 26,000 years, moving at a very nearly (but
not precisely) constant speed. The celestial equator, of course, moves along
with the pole, and the vernal equinox, which is the fundamental direction for
both the equatorial and ecliptic coordinate systems, moves westward along the
ecliptic at the rate (in the year 2000) of 5029.097 arcsec (about 1.4 degrees) per
century. Precession will in general cause both the right ascension and declination
of every star to change over time, and will also cause the ecliptic longitude
(but not the ecliptic latitude) to change.

Fig. 3.9 Precession of the equinoxes. The locations of the ecliptic and the ecliptic poles are fixed on the celestial sphere. The celestial equator moves so that the north celestial pole describes a small circle around the north ecliptic pole of radius equal to the mean obliquity.

The most influential ancient astronomer, Hipparchus of Rhodes (recorded
observations 141–127 BCE), spectacularly combined the rich tradition of
Babylonian astronomy, which was concerned with mathematical computation 
of future planetary positions from extensive historic records, and Greek astron- 
omy, which focused on geometrical physical models that described celestial 
phenomena. He constructed the first quantitative geocentric models for the 
motion of the Sun and Moon, developed the trigonometry necessary for his 
theory, injected the Babylonian sexagesimal numbering system (360° in a circle) 
into western use, and compiled the first systematic star catalog. Hipparchus 
discovered lunisolar precessional motion, as a steady regression of the equi- 
noxes, when he compared contemporary observations with the Babylonian 
records. Unfortunately, almost all his original writings are lost, and we know 
his work mainly through the admiring Ptolemy, who lived three centuries later.

Since the time of Hipparchus, the vernal equinox has moved about 30° along 
the ecliptic. In fact, we still refer to the vernal equinox as the “first point of 
Aries,” as did Hipparchus, even though it has moved out of the constellation 
Aries and through almost the entire length of the constellation Pisces since his 
time. Precession also means that the star Polaris is only temporarily located near 
the north celestial pole. About 4500 years ago, at about the time the Egyptians 
constructed the Great Pyramid, the “North Star” was Thuban, the brightest star
in Draco. In 12,000 years, the star Vega will be near the pole, and Polaris will
have a declination of 43°. 
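The arithmetic behind these figures can be checked with a short sketch. It takes the precession rate quoted above (5029.097 arcsec per century) and treats it as constant, which is only approximately true; the 2130-year interval between Hipparchus and the year 2000 is an assumed round number, and the function names are illustrative.

```python
# Precession rate of the equinox along the ecliptic, as quoted in the text
# for the year 2000 (arcsec per century).
PRECESSION_ARCSEC_PER_CENTURY = 5029.097


def precession_shift_deg(years):
    """Accumulated shift of the equinox in degrees, assuming a constant rate."""
    arcsec = PRECESSION_ARCSEC_PER_CENTURY * years / 100.0
    return arcsec / 3600.0


def precession_period_years():
    """Time for the equinox to travel 360 degrees at the constant rate."""
    full_circle_arcsec = 360.0 * 3600.0
    return 100.0 * full_circle_arcsec / PRECESSION_ARCSEC_PER_CENTURY


# Assumption: about 2130 years separate Hipparchus (c. 130 BCE) from 2000 CE.
shift_since_hipparchus = precession_shift_deg(2130)
period = precession_period_years()

print(f"Shift since Hipparchus: {shift_since_hipparchus:.1f} degrees")
print(f"Precession period: {period:.0f} years")
```

Under these assumptions the shift comes out close to the 30° mentioned in the text, and the period close to the quoted 26,000 years.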

Unlike lunisolar precession, planetary precession actually changes the angle 
between the equator and ecliptic. The result is an oscillation in the obliquity so 
that it ranges from 22° to 24°, with a period of about 41,000 years. At present, 
the obliquity is decreasing from an accepted J2000 value of 23° 26' 21.4" at a 
rate of about 47 arcsec per century. 

Nutation, the short period changes in the location of the NCP, is usually 
separated into two components. The first, nutation in longitude, is an oscillation 
of the equinox ahead of and behind the precessional position, with an amplitude 
of about 9.21 arcsec and a principal period of 18.6 years. The second, nutation in 
obliquity, is a change in the value of the angle between the equator and ecliptic. 
This also is a smaller oscillation, with an amplitude of about 6.86 arcsec and an 
identical principal period. Both components were discovered telescopically by 
James Bradley (1693–1762), the third British Astronomer Royal.

3.1.9 Barycentric coordinates 

Coordinates measured with a transit telescope from the surface of the moving 
Earth as described in the preceding section are in fact measured in a non- 
inertial reference frame, since the spin and orbital motions of the Earth accel- 
erate the telescope. These apparent equatorial coordinates exhibit variations 
introduced by this non-inertial frame, and their exact values will depend on 
the time of observation and the location of the telescope. Catalogs therefore 
give positions in an equatorial system similar to the one defined above, but
whose origin is at the barycenter (center of mass) of the Solar System. Bary- 
centric coordinates use the mean equinox of the catalog date (a fictitious equi- 
nox which moves with precessional motion, but not nutational). The barycentric 
coordinates are computed from the apparent coordinates by removing several 
effects. In addition to precession and nutation, we will discuss two others. The 
first, due to the changing vantage point of the telescope as the Earth executes its 
orbit, is called heliocentric stellar parallax. The small variation in a nearby 
object’s apparent coordinates due to parallax depends on the object’s distance 
and is an important quantity described in Section 3.2.2. 

The second effect, caused by the finite velocity of light, is called the aberra- 
tion of starlight, and produces a shift in every object’s apparent coordinates. 
The magnitude of the shift depends only on the angle between the object’s 
direction and the direction of the instantaneous velocity of the Earth. 
Figure 3.10 shows a telescope in the barycentric coordinate system, drawn so 
that the velocity of the telescope, at rest on the moving Earth, is in the +x 
direction. A photon from a distant object enters the telescope at point A, travels 
at the speed of light, c, and exits at point B. In the barycentric frame, the 
photon’s path makes an angle $\theta$ with the x-axis. However, if the photon is to
enter and exit the moving telescope successfully, the telescope must make an
angle $\theta' = \theta - \Delta\theta$ with the x-axis in the frame fixed on the Earth. A little
geometry shows that, if V is the speed of the Earth,

$$\Delta\theta = \frac{V}{c} \sin\theta$$

Thus aberration moves the apparent position of the source (the one measured by
a telescope on the moving Earth) towards the x-axis. The magnitude of this
effect is greatest when $\theta$ = 90°, where it amounts to about 20.5 arcsec.

Fig. 3.10 The aberration of starlight. A telescope points towards a source. The diagram shows the telescope moving to the right in the barycentric frame. The apparent direction of the source, $\theta'$, depends on the direction and magnitude of the telescope velocity.
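As a rough numerical check of the aberration formula, the sketch below evaluates (V/c) sin θ in arcseconds. The Earth's orbital speed of 29.78 km/s is an assumed mean value that the text does not state, and the function name is illustrative.

```python
import math

C_KM_S = 299792.458             # speed of light in km/s
V_EARTH_KM_S = 29.78            # assumption: mean orbital speed of the Earth
ARCSEC_PER_RADIAN = 206264.806  # arcseconds in one radian


def aberration_arcsec(theta_deg, v_km_s=V_EARTH_KM_S):
    """Aberration shift (V/c) sin(theta), converted from radians to arcsec.

    theta_deg is the angle between the source direction and the
    instantaneous velocity of the observer.
    """
    shift_rad = (v_km_s / C_KM_S) * math.sin(math.radians(theta_deg))
    return shift_rad * ARCSEC_PER_RADIAN


for theta in (0.0, 30.0, 90.0):
    print(f"theta = {theta:5.1f} deg -> shift = {aberration_arcsec(theta):6.2f} arcsec")
```

With this assumed speed, the maximum shift at θ = 90° agrees with the value of about 20.5 arcsec given in the text.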



3.1.10 The ICRS 

The International Astronomical Union (IAU) in 1991 recommended creation of a 
special coordinate system whose origin is at the barycenter of the Solar System, 
with a fundamental plane approximately coincident with the Earth’s equatorial 
plane in epoch J2000.0. The x-axis of this International Celestial Reference 
System (ICRS) is taken to be in the direction of the vernal equinox on that date. 
However, unlike the equatorial system, or previous barycentric systems, the axes 
of the ICRS are defined and fixed in space by the positions of distant galaxies, not 
by the apparent motion of the Sun. Unlike Solar System objects or nearby stars, 
these distant objects have undetectable angular motions relative to one another. 
Their relative positions do not depend on our imperfect knowledge or observa- 
tions of the Earth’s rotation, precession, and nutation. Thus, the ICRS is a very 
good approximation of an inertial, non-rotating coordinate system. 

In practice, radio-astronomical determinations of the equatorial coordinates of
over 200 compact extragalactic sources (mostly quasars) define this inertial 
reference frame in an ongoing observing program coordinated by the Interna- 
tional Earth Rotation Service in Paris. Directions of the ICRS axes are now 
specified to a precision of about 0.02 mas relative to this frame. The ICRS 
positions of optical sources are known primarily through HIPPARCOS and
Hubble Space Telescope (HST) observations near the optical counterparts of the
defining radio sources, as well as a larger number of other radio sources. Approx- 
imately 100,000 stars measured by HIPPARCOS thus have ICRS coordinates 
known with uncertainties typical of that satellite’s measurements, around 1 mas. 
Through the HIPPARCOS measurements, ICRS positions can be linked to the 
Earth-based fundamental catalog positions like FK5 (see Chapter 4). 

3.1.11 The ecliptic coordinate system 

The ecliptic, the apparent path of the Sun on the celestial sphere, can also be 
defined as the intersection of the Earth’s orbital plane with the celestial sphere. 
The orbital angular momentum of the Earth is much greater than its spin angular 
momentum, and the nature of the torques acting on each system suggests that 
the orbital plane is far more likely to remain invariant in space than is the 
equatorial plane. Moreover, the ecliptic plane is virtually coincident with the 
plane of symmetry of the Solar System as well as lying nearly perpendicular to 
the Solar System’s total angular momentum vector. As such, it is an important 
reference plane for observations and dynamical studies of Solar System objects. 

Astronomers define a geocentric coordinate system in which the ecliptic 
is the fundamental plane and the vernal equinox is the fundamental direction. 
Measure ecliptic longitude, $\lambda$, from west to east in the fundamental plane.
Measure the ecliptic latitude, $\beta$, positive northward from the ecliptic. Since
the vernal equinox is also the fundamental direction of the equatorial system,
the north ecliptic pole is located at RA = 18 hours and Dec = 90° − $\varepsilon$, where $\varepsilon$
is the obliquity of the ecliptic.
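Because the two systems share the vernal equinox as their fundamental direction, they differ only by a rotation about that direction through the obliquity. The following is a minimal sketch of that rotation, using the J2000 obliquity quoted in Section 3.1.8; the function name is illustrative, and precession and nutation are ignored.

```python
import math

# J2000 obliquity quoted in Section 3.1.8: 23 deg 26' 21.4"
OBLIQUITY_DEG = 23.0 + 26.0 / 60.0 + 21.4 / 3600.0


def ecliptic_to_equatorial(lon_deg, lat_deg, eps_deg=OBLIQUITY_DEG):
    """Convert ecliptic (longitude, latitude) to (RA in hours, Dec in degrees)."""
    lam = math.radians(lon_deg)
    beta = math.radians(lat_deg)
    eps = math.radians(eps_deg)

    # Unit vector in the ecliptic frame (x-axis towards the vernal equinox).
    x = math.cos(beta) * math.cos(lam)
    y = math.cos(beta) * math.sin(lam)
    z = math.sin(beta)

    # Rotate about the x-axis by the obliquity to reach the equatorial frame.
    xe = x
    ye = y * math.cos(eps) - z * math.sin(eps)
    ze = y * math.sin(eps) + z * math.cos(eps)

    ra_deg = math.degrees(math.atan2(ye, xe)) % 360.0
    dec_deg = math.degrees(math.asin(max(-1.0, min(1.0, ze))))
    return ra_deg / 15.0, dec_deg


# The north ecliptic pole should land at RA = 18 h, Dec = 90 deg - obliquity.
pole_ra_h, pole_dec_deg = ecliptic_to_equatorial(0.0, 90.0)
print(f"North ecliptic pole: RA = {pole_ra_h:.3f} h, Dec = {pole_dec_deg:.4f} deg")
```

Working with a unit vector and `atan2` avoids the quadrant ambiguity that arises when the spherical-trigonometry formulas are inverted with `asin` or `acos` alone.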

The ecliptic is so nearly an invariant plane in an inertial system that, unlike 
the equatorial coordinates, the ecliptic latitudes of distant stars or galaxies will 
not change with time because of precession and nutation. Ecliptic longitudes, on
the other hand, are tied to the location of the equinox, which is in turn defined by
the spin of the Earth, so longitudes will have a precessional change of about 50" 
per year. 

3.1.12 The Galactic coordinate system 

Whoever turns his eye to the starry heavens on a clear night will perceive that 
band of light. . . designated by the name Milky Way. . . it is seen to occupy the 
direction of a great circle, and to pass in uninterrupted connection round the 
whole heavens:. . . so perceptibly different from the indefiniteness of chance, 
that attentive astronomers ought to have been thereby led, as a matter of course, 
to seek carefully for the explanation of such a phenomenon. 

— Immanuel Kant, Universal Natural History and a Theory of the Heavens, 1755 

Kant’s explanation for the Milky Way envisions our own Galaxy as a flattened 
system with approximately cylindrical symmetry composed of a large number of
stars, each similar to the Sun. Astronomers are still adding detail to Kant’s
essentially correct vision: we know the Sun is offset from the center by a large 
fraction of the radius of the system, although the precise distance is uncertain by at 
least 5%. We know the Milky Way, if viewed from above the plane, would show 
spiral structure, but are uncertain of its precise form. Astronomers are currently 
investigating extensive evidence of remarkable activity in the central regions. 

It is clear that the central plane of the disk-shaped Milky Way Galaxy is 
another reference plane of physical significance. Astronomers have specified a 
great circle (the Galactic plane) that approximates the center-line of the Milky 
Way on the celestial sphere to constitute the fundamental plane of the Galactic 
coordinate system. We take the fundamental direction to be the direction of the
center of the Galaxy. Galactic latitude ($b$ or $b^{II}$) is then measured positive north
of the plane (the northern Galactic hemisphere is the one that contains the north
celestial pole), and Galactic longitude ($l$ or $l^{II}$) is measured from the Galactic
center so as to constitute a right-handed system.

Since neither precession nor nutation affects the Galactic latitude and longi- 
tude, these coordinates would seem to constitute a superior system. However, it 
is difficult to measure $l$ and $b$ directly, so the Galactic coordinates of any object
are in practice derived from its equatorial coordinates. The important parameters
are that the north Galactic pole ($b$ = +90°) is defined to be at

$$\alpha = 12{:}49{:}00, \quad \delta = +27.4^\circ \quad \text{(equator and equinox of 1950)}$$

and the Galactic center ($l$ = $b$ = 0) at

$$\alpha = 17{:}42{:}24, \quad \delta = -28^\circ 55' \quad \text{(equator and equinox of 1950)}$$

3.1.13 Transformation of coordinates 

Transformation of coordinates involves a combination of rotations and (sometimes)
translations. Note that for very precise work (the transformation of geocentric
to ICRS coordinates, for example), some general-relativistic modeling
may be needed.

Some of the more common transformations are addressed in the various 
national almanacs, and for systems related just by rotation (equatorial and 
Galactic, for example), you can work transformations out by using spherical 
trigonometry (see Section 3.1.2). Some important transformations are given in 
Appendix D, and calculators for most can be found on the Internet. 
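As an illustration of a pure rotation, the sketch below converts 1950 equatorial coordinates to Galactic coordinates with the standard spherical-trigonometry relations, using the pole position quoted in Section 3.1.12. One further constant is needed to fix the zero of longitude: the Galactic longitude of the north celestial pole, taken here as 123°. That value is an assumption not given in the text, and the function name is illustrative.

```python
import math

# North Galactic pole, equator and equinox of 1950 (values from Section 3.1.12).
ALPHA_NGP_DEG = 15.0 * (12.0 + 49.0 / 60.0)  # 12:49:00 -> 192.25 degrees
DELTA_NGP_DEG = 27.4
# Assumption (not given in the text): Galactic longitude of the north
# celestial pole in the 1950 system.
L_NCP_DEG = 123.0


def equatorial_to_galactic(ra_deg, dec_deg):
    """Convert 1950 equatorial (RA, Dec) in degrees to Galactic (l, b) in degrees."""
    da = math.radians(ra_deg - ALPHA_NGP_DEG)
    dec = math.radians(dec_deg)
    dec_g = math.radians(DELTA_NGP_DEG)

    # Galactic latitude from the angular distance to the Galactic pole.
    sin_b = (math.sin(dec) * math.sin(dec_g)
             + math.cos(dec) * math.cos(dec_g) * math.cos(da))
    b = math.asin(max(-1.0, min(1.0, sin_b)))

    # cos(b) sin(l_NCP - l) and cos(b) cos(l_NCP - l); atan2 fixes the quadrant.
    y = math.cos(dec) * math.sin(da)
    x = (math.sin(dec) * math.cos(dec_g)
         - math.cos(dec) * math.sin(dec_g) * math.cos(da))
    lon = (L_NCP_DEG - math.degrees(math.atan2(y, x))) % 360.0
    return lon, math.degrees(b)


# Galactic center as quoted in the text: RA 17:42:24, Dec -28 deg 55' (1950).
gc_ra_deg = 15.0 * (17.0 + 42.0 / 60.0 + 24.0 / 3600.0)
gc_dec_deg = -(28.0 + 55.0 / 60.0)
gc_l, gc_b = equatorial_to_galactic(gc_ra_deg, gc_dec_deg)
print(f"Galactic center: l = {gc_l:.3f} deg, b = {gc_b:.3f} deg")
```

Feeding in the quoted position of the Galactic center should return l and b close to zero (l may print as a value just below 360°), which is a useful consistency check on the three constants.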

3.2 The third dimension 

Determining the distance of almost any object in astronomy is notoriously 
difficult, and uncertainties in the coordinate r are usually enormous compared 
to uncertainties in direction. For example, the position of Alpha Centauri, the 
nearest star after the Sun, is uncertain in the ICRS by about 0.4 mas (three parts
in $10^{9}$ of a full circle), yet its distance, one of the best known, is uncertain by
about one part in 2500. A more extreme example would be one of the quasars
that define the ICRS, with a typical positional uncertainty of 0.02 mas (six parts
in $10^{10}$). Estimates of the distances to these objects depend on our understanding
of the expansion and deceleration of the Universe, and are probably uncertain by
at least 10%. This section deals with the first two rungs in what has been called
the “cosmic distance ladder,” the sequence of methods and calibrations that
ultimately allow us to measure distances (perhaps “estimate distances” would
be a better phrase) of the most remote objects.

Fig. 3.11 Radar ranging to Venus. The astronomical unit is the length of the line ES, which scales with EV, the Earth-to-Venus distance.

3.2.1 The astronomical unit 

We begin in our own Solar System. Kepler’s third law gives the scale of 
planetary orbits: 

$$a = P^{2/3}$$

where a is the average distance between the planet and the Sun measured in 
astronomical units (AU, or, preferably, au) and P is the orbital period in years. 
This law sets the relative sizes of planetary orbits. One au is defined to be the 
mean distance between the Earth and Sun, but the length of the au in meters, and 
the absolute scale of the Solar System, must be measured empirically. 
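A brief sketch of how the law fixes relative orbit sizes: given a period in years, it returns the mean distance in au. The sample periods for Venus and Jupiter are assumed illustrative values, not taken from the text.

```python
def semi_major_axis_au(period_years):
    """Kepler's third law in Solar System units: a = P**(2/3)."""
    return period_years ** (2.0 / 3.0)


def orbital_period_years(a_au):
    """Inverse relation: P = a**(3/2)."""
    return a_au ** 1.5


# Assumed illustrative orbital periods in years.
sample_periods = {"Venus": 0.615, "Earth": 1.0, "Jupiter": 11.86}

for planet, period in sample_periods.items():
    print(f"{planet:8s} P = {period:6.3f} yr -> a = {semi_major_axis_au(period):.3f} au")
```

Note that the result is a ratio to the Earth's orbit only; converting it to meters requires the separate calibration of the au.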

Figure 3.11 illustrates one method for calibrating the au. The figure shows the
Earth and the planet Venus when they are in a position such that the apparent
angular separation between Venus and the Sun as seen from Earth (the elongation
of Venus) is at a maximum. At this moment, a radio (radar) pulse is sent
from the Earth towards Venus, and a reflected pulse returns after elapsed time
$\Delta t$. The Earth-to-Venus distance is just

$$EV = \frac{1}{2} c \, \Delta t$$

Thus, from the right triangle in the figure, the length of the line ES is
one au or

$$1\ \mathrm{au} = ES = \frac{EV}{\cos\theta} = \frac{c \, \Delta t}{2 \cos\theta}$$

where $\theta$ is the maximum elongation of Venus, the angle at E in the triangle.

Clearly, some corrections need to be made because the orbit of neither planet is 
a perfect circle, but the geometry is known rather precisely. Spacecraft in orbit 
around Venus and other planets (Mars, Jupiter, and Saturn) also provide the 
opportunity to measure light-travel times, and similar geometric analyses yield 
absolute orbit sizes. The presently accepted value for the length of the au is 

$$1\ \mathrm{au} = 1.495978 \times 10^{11}\ \mathrm{m}$$
with an uncertainty of 1 part in $10^{6}$.
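The radar calibration can be sketched numerically under the idealization of circular orbits. At maximum elongation the line of sight is tangent to the orbit of Venus, so the right angle of the triangle is at Venus and ES is the hypotenuse. The round-trip time and elongation used below are hypothetical illustrative numbers, not measurements, and the function name is illustrative.

```python
import math

C_M_S = 299792458.0  # speed of light in m/s


def au_from_radar(delta_t_s, elongation_deg):
    """Estimate the Earth-Sun distance ES in meters from a radar echo.

    delta_t_s      -- round-trip travel time of the pulse to Venus (seconds)
    elongation_deg -- maximum elongation of Venus, the angle at the Earth
    Assumes circular orbits, so the right angle of the triangle is at Venus.
    """
    ev = 0.5 * C_M_S * delta_t_s  # one-way Earth-to-Venus distance
    return ev / math.cos(math.radians(elongation_deg))


# Hypothetical illustrative inputs: 689.5 s round trip, 46.3 degree elongation.
es_m = au_from_radar(689.5, 46.3)
print(f"Estimated au: {es_m:.4e} m")
```

With these made-up inputs the estimate lands near the accepted value, but a real determination must correct for the eccentricity of both orbits, as the text notes.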



Fig. 3.12 The parallactic ellipse. The apparent position of the nearby star, S, as seen from Earth, traces out an elliptical path on the very distant celestial sphere as a result of the Earth's orbital motion.



3.2.2 Stellar parallax 

Once the length of the au has been established, we can determine the distances 
to nearby stars through observations of heliocentric stellar parallax. Figure 
3.12 depicts the orbit of the Earth around the Sun. The plane of the orbit is 
the ecliptic plane, and we set up a Sun-centered coordinate system with the 
ecliptic as the fundamental plane, the z-axis pointing towards the ecliptic pole, 
and the y-axis chosen so that a nearby star, S, is in the y-z plane. The distance
from the Sun to S is r. As the Earth travels in its orbit, the apparent position of 
the nearby star shifts in relation to very distant objects. Compared to the back- 
ground objects, the nearby star appears to move around the perimeter of the 
parallactic ellipse, reflecting the Earth’s orbital motion. 

Figure 3.13 shows the plane that contains the x-axis and the star. The parallax
angle, p, is half the total angular shift in the star’s position (the semi-major axis
of the parallactic ellipse in angular units). From the right triangle formed by the
Sun, star, and Earth:

$$\tan p = \frac{a}{r}$$

where a is one au. Since p is in every case going to be very small, we make the
small-angle approximation: for $p \ll 1$,

$$\tan p \approx \sin p \approx p$$

So that for any right triangle where p is small:

$$p = \frac{a}{r}$$

Fig. 3.13 The parallax angle.



In this equation, it is understood that a and r are measured in the same units (au,
for example) and p is measured in radians. Radian measure is somewhat inconvenient
for small angles, so, noting that there are about 206 265 arcsec per
radian, we can rewrite the small-angle formula as

$$p\,[\mathrm{arcsec}] = 206\,265 \, \frac{a}{r} \quad [a, r \text{ in same units}]$$

Finally, to avoid very large numbers for r, it is both convenient and traditional to
define a new unit, the parsec, with the length:

$$1\ \mathrm{parsec} = 206\,265\ \mathrm{au} = 3.085678 \times 10^{16}\ \mathrm{m} = 3.261633\ \text{light years}$$

The parsec (pc) is so named because it is the distance of an object whose
parallax is one second of arc. With the new unit, the parallax equation
becomes:

$$p\,[\mathrm{arcsec}] = \frac{a\,[\mathrm{au}]}{r\,[\mathrm{pc}]} \qquad (3.1)$$

This equation represents a fundamental relationship between the small angle 
and the sides of the astronomical triangle (any right triangle with one very short 
side). For example, suppose a supergiant star is 20 pc away, and we measure its 
angular diameter with the technique of speckle interferometry as 0.023 arcsec. 
Then the physical diameter of the star, which is the short side of the relevant
astronomical triangle (the quantity a in Equation (3.1)), must be 20 pc × 0.023
arcsec = 0.46 au.
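The supergiant example and the length of the parsec can both be reproduced in a few lines. This is a small sketch using only the numbers quoted in the text; the function name is illustrative.

```python
AU_PER_PARSEC = 206265.0  # arcsec per radian, hence au per parsec
AU_IN_M = 1.495978e11     # length of the au quoted in Section 3.2.1


def short_side_au(angle_arcsec, distance_pc):
    """Equation (3.1) solved for a: a [au] = p [arcsec] * r [pc]."""
    return angle_arcsec * distance_pc


# Supergiant example from the text: 0.023 arcsec across at 20 pc.
diameter_au = short_side_au(0.023, 20.0)
parsec_in_m = AU_PER_PARSEC * AU_IN_M

print(f"Stellar diameter: {diameter_au:.2f} au")
print(f"One parsec: {parsec_in_m:.4e} m")
```

The convenience of the parsec is visible here: with angles in arcsec and distances in pc, the short side comes out directly in au with no conversion factor.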

In the case of stellar parallax, the short side of the triangle is always 1 au. If 
a = 1 in Equation (3.1), we have: 

$$p\,[\mathrm{arcsec}] = \frac{1}{r\,[\mathrm{pc}]} \qquad (3.2)$$

In the literature, the parallax angle is often symbolized as $\pi$ instead of p. Note
that the parallactic ellipse will have a semi-major axis equal to p, and a semi-
minor axis equal to p sin $\beta$, where $\beta$ is the ecliptic latitude of the star. The axes of
an ellipse fit to multiple observations of the position of a nearby star will there- 
fore estimate its parallax. 
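A minimal sketch of Equation (3.2) and of the ellipse axes follows. The parallax and latitude values in the example are hypothetical, chosen only to show the shape of the calculation, and the function names are illustrative.

```python
import math


def distance_pc(parallax_arcsec):
    """Equation (3.2) inverted: r [pc] = 1 / p [arcsec]."""
    return 1.0 / parallax_arcsec


def parallactic_ellipse_axes(parallax_arcsec, ecliptic_latitude_deg):
    """Return (semi-major, semi-minor) axes of the parallactic ellipse in arcsec.

    The semi-major axis is p; the semi-minor axis is p times the sine of
    the ecliptic latitude of the star.
    """
    semi_major = parallax_arcsec
    semi_minor = parallax_arcsec * abs(math.sin(math.radians(ecliptic_latitude_deg)))
    return semi_major, semi_minor


# Hypothetical star: parallax 0.1 arcsec at ecliptic latitude 30 degrees.
r_pc = distance_pc(0.1)
major, minor = parallactic_ellipse_axes(0.1, 30.0)
print(f"Distance: {r_pc:.1f} pc, ellipse axes: {major:.3f} x {minor:.3f} arcsec")
```

A star at the ecliptic pole traces a circle of radius p, while a star on the ecliptic moves back and forth along a line of half-length p.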

There are, of course, uncertainties in the measurement of small angles like 
the parallax angle. Images of stars formed by Earth-based telescopes are typi- 
cally blurred by the atmosphere, and are seldom smaller than a half arc second 
in diameter, and are often much larger. In the early days of telescopic astronomy, 
a great visual observer, James Bradley (1693–1762), like many astronomers
before him, undertook the task of measuring stellar parallax. Bradley could 
measure stellar positions with a precision of about 0.5 arcsec (500 milli- 
arcseconds or mas). This precision was sufficient to discover the phenomena 
of nutation and aberration, but not to detect a stellar parallax (the largest 




